Description

METHOD OF DETERMINING THE NUCLEOTIDE SEQUENCE OF OLIGONUCLEOTIDES AND DNA MOLECULES

Cross Reference to Related Applications 

[0001] This patent application is a continuation of Serial No.
09/941,882 filed August 28, 2001, which is a continuation-in-part
of Serial No. 09/673,544 filed February 26,
2001 and now abandoned, which is a continuation-in-part
of PCT/US99/09616 filed April 30, 1999 and
claims the benefit of provisional application Serial No.
60/083,840 filed May 1, 1998.
Introduction 

[0002] The present invention relates to a novel method for ana- 
lyzing nucleic acid sequences based on real-time detec- 
tion of DNA polymerase-catalyzed incorporation of each 
of the four deoxynucleoside monophosphates, supplied 



individually and serially as deoxynucleoside triphosphates 
in a microfluidic system, to a template system comprising
a DNA fragment of unknown sequence and an oligonu- 
cleotide primer. Incorporation of a deoxynucleoside- 
monophosphate (dNMP) into the primer can be detected 
by any of a variety of methods including but not limited to 
fluorescence and chemiluminescence detection. Alternatively,
microcalorimetric detection of the heat generated by
the incorporation of a dNMP into the extending primer us- 
ing thermopile, thermistor and refractive index measure- 
ments can be used to detect extension reactions. The 
present invention further provides a method for monitor- 
ing and correction of sequencing errors due to misincor- 
poration or extension failure. 
[0003] The present invention provides a method for sequencing 
DNA that avoids electrophoretic separation of DNA frag- 
ments thus eliminating the problems associated with 
anomalous migration of DNA due to repeated base se- 
quences or other self-complementary sequences which 
can cause single-stranded DNA to self-hybridize into 
hairpin loops, and also avoids current limitations on the 
size of fragments that can be read. The method of the invention
can be utilized to determine the nucleotide sequence
of genomic or cDNA fragments, or alternatively, as
a diagnostic tool for sequencing patient derived DNA
samples.
Background of Invention 

[0004] Currently, two approaches are utilized for DNA sequence 
determination: the dideoxy chain termination method of 
Sanger (1977, Proc. Natl. Acad. Sci 74:5463-5467) and
the chemical degradation method of Maxam (1977, Proc. 
Natl. Acad. Sci 74:560-564). The Sanger dideoxy chain 
termination method is the most widely used method and 
is the method upon which automated DNA sequencing 
machines rely. In the chain termination method, DNA 
polymerase enzyme is added to four separate reaction 
systems to make multiple copies of a template DNA 
strand in which the growth process has been arrested at 
each occurrence of an A, in one set of reactions, and a G, 
C, or T, respectively, in the other sets of reactions, by in- 
corporating in each reaction system one nucleotide type 
lacking the 3'-OH on the deoxyribose at which chain ex- 
tension occurs. This procedure produces a series of DNA 
fragments of different lengths, and it is the length of the 
extended DNA fragment that signals the position along 
the template strand at which each of four bases occur. To 



determine the nucleotide sequence, the DNA fragments 
are separated by high resolution gel electrophoresis and 
the order of the four bases is read from the gel. 

[0005] A major research goal is to derive the DNA sequence of
the entire human genome. To meet this goal the need has 
developed for new genomic sequencing technology that 
can dispense with the difficulties of gel electrophoresis, 
lower the costs of performing sequencing reactions, in- 
cluding reagent costs, increase the speed and accuracy of 
sequencing, and increase the length of sequence that can 
be read in a single step. Potential improvements in se- 
quencing speed may be provided by a commercialized 
capillary gel electrophoresis technique such as that described
in Marshall and Pennisi (1998, Science
280:994-995). However, a major problem common to all 
gel electrophoresis approaches is the occurrence of DNA 
sequence compressions, usually arising from secondary 
structures in the DNA fragment, which result in anoma- 
lous migration of certain DNA fragments through the gel. 

[0006] As genomic information accumulates and the relation- 
ships between gene mutations and specific diseases are 
identified, there will be a growing need for diagnostic 
methods for identification of mutations. In contrast to the 



large scale methods needed for sequencing large seg- 
ments of the human genome, what is needed for diagnos- 
tic methods are repetitive, low-cost, highly accurate tech- 
niques for resequencing of certain small isolated regions 
of the genome. In such instances, methods of sequencing 
based on gel electrophoresis readout become far too slow 
and expensive. 

[0007] When considering novel DNA sequencing techniques, the
possibility of reading the sequence directly, much as the 
cell does, rather than indirectly as in the Sanger 
dideoxynucleotide approach, is a preferred goal. This was 
the goal of early unsuccessful attempts to determine the 
shapes of the individual nucleotide bases with scanning 
probe microscopes. 

[0008] Additionally, another approach for reading a nucleotide 
sequence directly is to treat the DNA with an exonuclease 
coupled with a detection scheme for identifying each nu- 
cleotide sequentially released as described in Goodwin, et 
al., (1995, Experimental Techniques of Physics 
41:279-294). However, researchers using this technology 
are confronted with the enormous problem of detecting 
and identifying single nucleotide molecules as they are di- 
gested from a single DNA strand. Simultaneous exonucle- 



ase digestion of multiple DNA strands to yield larger sig- 
nals is not feasible because the enzymes rapidly get out of 
phase, so that nucleotides from different positions on the 
different strands are released together, and the sequences 
become unreadable. It would be highly beneficial if some 
means of external regulation of the exonuclease could be 
found so that multiple enzyme molecules could be com- 
pelled to operate in phase. However, external regulation 
of an enzyme that remains docked to its polymeric sub- 
strate is exceptionally difficult, if not impossible, because 
after each digestion the next substrate segment is imme- 
diately present at the active site. Thus, any controlling 
signal must be present at the active site at the start of 
each reaction. 

[0009] A variety of methods may be used to detect the polymerase-catalyzed
incorporation of deoxynucleoside
monophosphates (dNMPs) into a primer at each template 
site. For example, the pyrophosphate released whenever 
DNA polymerase adds one of the four dNTPs onto a 
primer 3' end may be detected using a chemiluminescent 
based detection of the pyrophosphate as described in Hy- 
man E.D. (1988, Analytical Biochemistry 174:423-436) 
and U.S. Patent No. 4,971,903. This approach has been 



utilized most recently in a sequencing approach referred 
to as "sequencing by incorporation" as described in Ron- 
aghi (1996, Analytical Biochem. 242:84) and Ronaghi 
(1998, Science 281:363-365). However, there exist two
key problems associated with this approach: destruction
of unincorporated nucleotides and detection of pyrophosphate.
The solution to the first problem is to destroy the
added, unincorporated nucleotides using a dNTP-digesting
enzyme such as apyrase. The solution to the second
is the detection of the pyrophosphate using ATP sulfurylase
to convert the pyrophosphate to ATP, which can
be detected by a luciferase chemiluminescent reaction as
described in U.S. Patent No. 4,971,903 and Ronaghi
(1998, Science 281:363-365). Deoxyadenosine α-thiotriphosphate
is used instead of dATP to minimize direct
interaction of injected dATP with the luciferase.
[0010] Unfortunately, the requirement for multiple enzyme reac- 
tions to be completed in each cycle imposes restrictions 
on the speed of this approach while the read length is 
limited by the impossibility of completely destroying un- 
incorporated, non-complementary, nucleotides. If some 
residual amount of one nucleotide remains in the reaction 
system at the time when a fresh aliquot of a different nu- 



cleotide is added for the next extension reaction, there 
exists a possibility that some fraction of the primer 
strands will be extended by two or more nucleotides, the 
added nucleotide type and the residual impurity type, if 
these match the template sequence, and so this fraction 
of the primer strands will then be out of phase with the 
remainder. This out of phase component produces an er- 
roneous incorporation signal which grows larger with each 
cycle and ultimately makes the sequence unreadable. 
[0011] A different direct sequencing approach uses dNTPs tagged
at the 3' OH position with four different colored fluorescent
tags, one for each of the four nucleotides, as described
in Metzker, M.L., et al. (1994, Nucleic Acids Research
22:4259-4267). In this approach, the primer/
template duplex is contacted with all four dNTPs simulta- 
neously. Incorporation of a 3' tagged NMP blocks further 
chain extension. The excess and unreacted dNTPs are 
flushed away and the incorporated nucleotide is identified 
by the color of the incorporated fluorescent tag. The fluo- 
rescent tag must then be removed in order for a subse- 
quent incorporation reaction to occur. Similar to the py- 
rophosphate detection method, incomplete removal of a 
blocking fluorescent tag leaves some primer strands un- 



extended on the next reaction cycle, and if these are sub- 
sequently unblocked in a later cycle, once again an out- 
of-phase signal is produced which grows larger with each 
cycle and ultimately limits the read length. To date, this
method has been demonstrated to work for only a
single base extension. Thus, this method is slow and is
likely to be restricted to very short read lengths due to the 
fact that 99% efficiency in removal of the tag is required to 
read beyond 50 base pairs. Incomplete removal of the la- 
bel results in out of phase extended DNA strands. 
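
As a rough illustration of why tag-removal efficiency bounds the read length, the following sketch assumes a simple model, not stated in this form in the cited work, in which every cycle succeeds independently with the same efficiency, so that the in-phase fraction after n cycles is the efficiency raised to the power n. The function names are hypothetical.

```python
import math

def in_phase_fraction(efficiency: float, cycles: int) -> float:
    """Fraction of strands still in phase after `cycles` steps, assuming each
    step succeeds independently with probability `efficiency` (an assumption)."""
    return efficiency ** cycles

def max_cycles(efficiency: float, min_fraction: float) -> int:
    """Largest cycle count for which the in-phase fraction stays >= min_fraction.
    Requires 0 < efficiency < 1 and 0 < min_fraction < 1."""
    return math.floor(math.log(min_fraction) / math.log(efficiency))

if __name__ == "__main__":
    # With 99% per-cycle efficiency, roughly 60% of strands remain in phase
    # after 50 cycles under this model.
    print(round(in_phase_fraction(0.99, 50), 3))
    print(max_cycles(0.99, 0.5))
```

Under this model the out-of-phase population grows geometrically with cycle number, which is consistent with the statement that very high removal efficiency is needed for reads of tens of bases.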
Summary of Invention 

[0012] Accordingly, it is an object of the present invention to 
provide a novel method for determining the nucleotide 
sequence of a DNA fragment which eliminates the need 
for electrophoretic separation of DNA fragments. The in- 
ventive method, referred to herein as "reactive sequenc- 
ing", is based on detection of DNA polymerase catalyzed 
incorporation of each of the four nucleotide types, when 
deoxynucleoside triphosphates (dNTP's) are supplied indi- 
vidually and serially to a DNA primer/template system. 
The DNA primer/template system comprises a single 
stranded DNA fragment of unknown sequence, an 
oligonucleotide primer that forms a matched duplex with 



a short region of the single stranded DNA, and a DNA 
polymerase enzyme. The enzyme may either be already 
present in the template system, or may be supplied to- 
gether with the dNTP solution. 

[0013] Typically a single deoxynucleoside triphosphate (dNTP) is 
added to the DNA primer template system and allowed to 
react. As used herein deoxyribonucleotide means and in- 
cludes, in addition to dGTP, dCTP, dATP, dTTP, chemically 
modified versions of these deoxyribonucleotides or 
analogs thereof. Such chemically modified deoxyribonu- 
cleotides include but are not limited to those deoxyri- 
bonucleotides tagged with a fluorescent or chemilumines- 
cent moiety. Analogs of deoxyribonucleotides that may be 
used include but are not limited to 7-deazapurine. The 
present invention additionally provides a method for im- 
proving the purity of deoxynucleotides used in the poly- 
merase reaction. 

[0014] An extension reaction will occur only when the incoming
dNTP base is complementary to the next unpaired base of
the DNA template beyond the 3' end of the primer. While
the reaction is occurring, or after a delay of sufficient du- 
ration to allow a reaction to occur, the system is tested to 
determine whether an additional nucleotide derived from 



the added dNTP has been incorporated into the DNA 
primer/template system. A correlation between the dNTP 
added to the reaction cell and detection of an incorpora- 
tion signal identifies the nucleotide incorporated into the 
primer/template. The amplitude of the incorporation sig- 
nal identifies the number of nucleotides incorporated, and 
thereby quantifies single base repeat lengths where these 
occur. By repeating this process with each of the four nu- 
cleotides individually, the sequence of the template can be 
directly read in the 5' to 3' direction one nucleotide at a 
time. 
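
The probing cycle described above can be sketched as a small idealized simulation. It assumes perfect fidelity, complete extension, and a signal amplitude exactly proportional to the number of nucleotides incorporated; the function names and the `template_3to5` convention (template bases listed in the order the polymerase encounters them) are illustrative assumptions, not part of the disclosure.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reactive_sequencing(template_3to5: str, cycle: str = "CTGA", max_rounds: int = 1000):
    """Return a list of (dNTP, incorporation_count) for each probe step.

    A count of 0 means no extension signal; a count > 1 corresponds to a
    single-base repeat read out from the signal amplitude.
    """
    pos = 0
    signals = []
    for _ in range(max_rounds):
        for dntp in cycle:
            count = 0
            # Extend while the supplied dNTP pairs with the next template base.
            while pos < len(template_3to5) and COMPLEMENT[template_3to5[pos]] == dntp:
                count += 1
                pos += 1
            signals.append((dntp, count))
            if pos == len(template_3to5):
                return signals
    return signals

def read_sequence(signals) -> str:
    """Reconstruct the newly synthesized strand (5' to 3') from the signals."""
    return "".join(dntp * count for dntp, count in signals)

if __name__ == "__main__":
    sig = reactive_sequencing("CTTTGG")
    print(sig)
    print(read_sequence(sig))
```

In this sketch the read sequence is that of the extended primer strand; the template sequence follows by complementarity.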

[0015] Detection of the polymerase mediated extension reaction 
and quantification of the extent of reaction can occur by a 
variety of different techniques, including but not limited 
to, microcalorimetric detection of the heat generated by
the incorporation of a nucleotide into the extending du- 
plex. Optical detection of an extension reaction by fluo- 
rescence or chemiluminescence may also be used to de- 
tect incorporation of nucleotides tagged with fluorescent 
or chemiluminescent entities into the extending duplex. 
Where the incorporated nucleotide is tagged with a fluo- 
rophore, excess unincorporated nucleotide is removed, 
and the template system is illuminated to stimulate fluo- 



rescence from the incorporated nucleotide. The fluores- 
cent tag may then be cleaved and removed from the DNA 
template system before a subsequent incorporation cycle 
begins. A similar process is followed for chemilumines- 
cent tags, with the chemiluminescent reaction being stim- 
ulated by introducing an appropriate reagent into the sys- 
tem, again after excess unreacted tagged dNTP has been 
removed; however, chemiluminescent tags are typically 
destroyed in the process of readout and so a separate 
cleavage and removal step following detection may not be 
required. For either type of tag, fluorescent or chemilumi- 
nescent, the tag may also be cleaved after incorporation 
and transported to a separate detection chamber for fluo- 
rescent or chemiluminescent detection. In this way, fluo- 
rescent quenching by adjacent fluorophore tags incorpo- 
rated in a single base repeat sequence may be avoided. In 
addition, this may protect the DNA template system from 
possible radiation damage in the case of fluorescent de- 
tection or from possible chemical damage in the case of 
chemiluminescent detection. Alternatively the fluorescent 
tag may be selectively destroyed by a chemical or photo- 
chemical reaction. This process eliminates the need to 
cleave the tag after each readout, or to detach and trans- 



port the tag from the reaction chamber to a separate de- 
tection chamber for fluorescent detection. The present in- 
vention provides a method for selective destruction of a 
fluorescent tag by a photochemical reaction with 
diphenyliodonium ions or related species. 
[0016] The present invention further provides a reactive se- 
quencing method that utilizes a two cycle system. An ex- 
onuclease-deficient polymerase is used in the first cycle 
and a mixture of exonuclease-deficient and exonuclease-proficient
enzymes is used in the second cycle. In the
first cycle, the template-primer system together with an 
exonuclease-deficient polymerase will be presented se- 
quentially with each of the four possible nucleotides. In 
the second cycle, after identification of the correct nu- 
cleotide, a mixture of exonuclease proficient and deficient 
polymerases, or a polymerase containing both types of 
activity will be added in a second cycle together with the 
correct dNTP identified in the first cycle to complete and 
proofread the primer extension. In this way, an exonucle- 
ase-proficient polymerase is only present in the reaction 
cell when the correct dNTP is present, so that exonucle- 
olytic degradation of correctly extended strands does not 
occur, while degradation and correct re-extension of pre- 



viously incorrectly extended strands does occur, thus 
achieving extremely accurate strand extension. 
[0017] The present invention also provides a method for moni- 
toring reactive sequencing reactions to detect and correct 
sequencing reaction errors resulting from misincorpora- 
tion, i.e., incorrectly incorporating a non-complementary 
base, and extension failure, i.e., failure to extend a frac- 
tion of the DNA primer strands. The method is based on 
the ability to (i) determine the size of the trailing strand 
population (trailing strands are those primer strands 
which have undergone an extension failure at any exten- 
sion prior to the current reaction step); (ii) determine the 
downstream sequence of the trailing strand population 
between the 3' terminus of the trailing strands and the 3' 
terminus of the corresponding leading strands 
("downstream" refers to the template sequence beyond 
the current 3' terminus of a primer strand; correspond- 
ingly, "upstream" refers to the known template and com- 
plementary primer sequence towards the 5' end of the 
primer strand; "leading strands" are those primer strands 
which have not previously undergone extension failure); 
and (iii) predict at each extension step the signal to be 
expected from the extension of the trailing strands 



through simulation of the occurrence of an extension fail- 
ure at any point upstream from the 3' terminus of the 
leading strand. Subtraction of the predicted signal from 
the measured signal yields a signal due only to valid ex- 
tension of the leading strand population. 
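
The subtraction step in (iii) can be illustrated with a minimal sketch. It assumes a single trailing population of known size, a signal proportional to the number of bases incorporated, and that the trailing strands extend only through sequence already read from the leading strands; the names `run_length` and `correct_signal` are hypothetical.

```python
def run_length(seq: str, pos: int, dntp: str) -> int:
    """Number of consecutive `dntp` bases in `seq` starting at index `pos`."""
    n = 0
    while pos + n < len(seq) and seq[pos + n] == dntp:
        n += 1
    return n

def correct_signal(measured: float, dntp: str, known_strand: str,
                   trailing_pos: int, trailing_fraction: float):
    """Subtract the predicted trailing-strand signal from the measured signal.

    known_strand      : bases already read from the leading strands (5' to 3')
    trailing_pos      : number of bases the trailing strands have incorporated
    trailing_fraction : fraction of all strands in the trailing population
    Returns (corrected_signal, new_trailing_pos).
    """
    k = run_length(known_strand, trailing_pos, dntp)
    predicted = trailing_fraction * k
    return measured - predicted, trailing_pos + k

if __name__ == "__main__":
    # Leading strands have read "GAAACC"; 20% of strands trail after "GAAA".
    # Probing with dCTP extends only the trailing strands by two bases.
    print(correct_signal(0.4, "C", "GAAACC", 4, 0.2))
```

A fuller treatment would also handle trailing strands that catch up with and continue alongside the leading strands within a single probe step, which this sketch does not model.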

[0018] In a preferred embodiment of the invention, the monitoring
for reactive sequencing reaction errors is computer-aided.
The ability to monitor extension failures permits
determination of the point to which the trailing strands for 
a given template sequence have advanced and the se- 
quence in the 1, 2 or 3 base gap between these strands 
and the leading strands. Knowing this information the 
dNTP probe cycle can be altered to selectively extend the 
trailing strands for a given template sequence while not 
extending the leading strands, thereby resynchronizing 
the populations. 
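
One way to picture the altered probe cycle is the following sketch, which is a conservative illustration under stated assumptions rather than the disclosed procedure: given the gap sequence the trailing strands still need and the next base the leading strands need, it returns the ordered dNTP additions that close the gap, or `None` if any required addition would also extend the leading strands. The function name `resync_plan` is hypothetical.

```python
from itertools import groupby

def resync_plan(gap: str, leading_next: str):
    """Plan dNTP additions that extend only the trailing strands.

    gap          : bases (5' to 3') the trailing strands must add to catch up
    leading_next : next base the leading strands would incorporate
    Returns a list of dNTPs to supply in order, or None if some required
    dNTP equals `leading_next` (it would also extend the leading strands).
    """
    # A run of identical bases is filled by a single addition of that dNTP.
    plan = [base for base, _ in groupby(gap)]
    if leading_next in plan:
        return None
    return plan

if __name__ == "__main__":
    print(resync_plan("CC", "A"))
    print(resync_plan("GA", "A"))
```

For the 1, 2 or 3 base gaps mentioned above, the plan contains at most three additions.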

[0019] The present invention further provides an apparatus for 
DNA sequencing comprising: (a) at least one chamber in- 
cluding a DNA primer/template system which produces a 
detectable signal when a DNA polymerase enzyme incor- 
porates a deoxyribonucleotide monophosphate onto the 
3' end of the primer strand; (b) means for introducing 
into, and evacuating from, the reaction chamber at least 



one selected from the group consisting of buffers, elec- 
trolytes, DNA template, DNA primer, deoxyribonu- 
cleotides, and polymerase enzymes; (c) means for ampli- 
fying said signal; and (d) means for converting said signal 

into an electrical signal. 
Brief Description of Drawings 

[0020] Further objects and advantages of the invention will be 
apparent from a reading of the following description in 
conjunction with the accompanying drawings, in which: 

[0021] Figure 1 is a schematic diagram illustrating a reactive se- 
quencing device containing a thin film bismuth antimony 
thermopile in accordance with the invention; 

[0022] Figure 2 is a schematic diagram of a reactive sequencing 
device containing a thermistor in accordance with the in- 
vention; 

[0023] Figure 3 is a schematic diagram illustrating a representative
embodiment of microcalorimetric detection of a DNA
polymerase reaction in accordance with the invention; 

[0024] Figure 4 is an electrophoretic gel showing a time course 
for primer extension assays catalyzed by T4 DNA poly- 
merase mutants; 

[0025] Figure 5 is a schematic diagram illustrating a nucleotide 
attached to a fluorophore by a benzoin ester which is a 



photocleavable linker for use in the invention; 

[0026] Figure 6 is a schematic illustration of a nucleotide at- 
tached to a chemiluminescent tag for use in the invention; 

[0027] Figure 7 is a schematic diagram of a nucleotide attached 
to a chemiluminescent tag by a cleavable linkage; 

[0028] Figures 8(a) and 8(b) are schematic diagrams of a me- 
chanical fluorescent sequencing method in accordance 
with the invention in which a DNA template and primer are 
absorbed on beads captured behind a porous frit; and 

[0029] Figure 9 is a schematic diagram of a sequencing method 
in accordance with the invention utilizing a two cycle sys- 
tem. 

[0030] Figure 10 is a diagram of the mechanism of photochemi- 
cal degradation of fluorescein by diphenyliodonium ion 
(DPI). 

[0031] Fig. 11 shows fluorescence spectra of equimolar concen- 
trations of fluorescein and tetramethylrhodamine dyes 
before and after addition of a solution of diphenyliodo- 
nium chloride. 

[0032] Figure 12 is the UV absorption spectra obtained from (1) 
fluorescein and (2) fluorescein + DPI after a single flash 
from a xenon camera strobe. 

[0033] Figure 13 displays the fluorescence spectra from single 



nucleotide polymerase reactions with DPI photobleaching 
between incorporation reactions. 

[0034] Figure 14A-D. Simulation of Reactive Sequencing of 

[CTGA] GAA ACC AGA AAG TCC [T], probed with a dNTP 
cycle. 14A. Sequence readout close to the primer where 
no extension failure has occurred. 14B. Sequence readout 
downstream of primer where 60% of the strands have un- 
dergone extension failure and are producing out of phase 
signals and misincorporation has prevented extension on 
75% of all strands. 14C. Downstream readout with error 
signals from trailing strands (dark shading) distinguished 
from correct readout signals from leading strands (light 
shading) using knowledge of the downstream sequence of 
the trailing strands. 14D. Corrected sequence readout fol- 
lowing subtraction of error signals from trailing strands. 
Note the similarity to the data of Fig. 14A.

[0035] Figure 15. Effect of a leading strand population on exten- 
sion signals. 
Detailed Description 

[0036] The present invention provides a method for determining 
the nucleic acid sequence of a DNA molecule based on 
detection of successive single nucleotide DNA polymerase 
mediated extension reactions. As described in detail be- 



low, in one embodiment, a DNA primer/template system 
comprising a polynucleotide primer complementary to and 
bound to a region of the DNA to be sequenced is con- 
strained within a reaction cell into which buffer solutions 
containing various reagents necessary for a DNA poly- 
merase reaction to occur are added. Into the reaction cell, 
a single type of deoxynucleoside triphosphate (dNTP) is 
added. Depending on the identity of the next complemen- 
tary site in the DNA primer/template system, an extension 
reaction will occur only when the appropriate nucleotide is 
present in the reaction cell. A correlation between the nu- 
cleotide present in the reaction cell and detection of an 
incorporation signal identifies the next nucleotide of the 
template. Following each extension reaction, the reaction 
cell is flushed with dNTP-free buffer, retaining the DNA 
primer/template system, and the cycle is repeated until 
the entire nucleotide sequence is identified. 
[0037] The present invention is based on the existence of a con- 
trol signal within the active site of DNA polymerases which 
distinguish, with high fidelity, complementary and non- 
complementary fits of incoming deoxynucleotide triphos- 
phates to the base on the template strand at the primer 
extension site, i.e., to read the sequence, and to incorpo- 



rate at that site only the one type of deoxynucleotide that 
is complementary. That is, if the available nucleotide type 
is not complementary to the next template site, the poly- 
merase is inactive, thus, the template sequence is the 
DNA polymerase control signal. Therefore, by contacting a 
DNA polymerase system with a single nucleotide type 
rather than all four, the next base in the sequence can be 
identified by detecting whether or not a reaction occurs. 
Further, single base repeat lengths can be quantified by 
quantifying the extent of reaction. 
[0038] As a first step in the practice of the inventive method, sin- 
gle-stranded template DNA to be sequenced is prepared 
using any of a variety of different methods known in the 
art. Two types of DNA can be used as templates in the se- 
quencing reactions. Pure single-stranded DNA such as 
that obtained from recombinant bacteriophage can be 
used. The use of bacteriophage provides a method for 
producing large quantities of pure single stranded template.
Alternatively, single-stranded DNA derived
from double-stranded DNA that has been denatured by
heat or alkaline conditions, as described in Chen and Seeburg
(1985, DNA 4:165); Hattori and Sakaki (1986, Anal.
Biochem. 152:232); and Mierendorf and Pfeffer (1987,
Methods Enzymol. 152:556), may be used. Such double
stranded DNA includes, for example, DNA samples de- 
rived from patients to be used in diagnostic sequencing 
reactions. 

[0039] The template DNA can be prepared by various techniques 
well known to those of skill in the art. For example, tem- 
plate DNA can be prepared as vector inserts using any 
conventional cloning methods, including those used fre- 
quently for sequencing. Such methods can be found in 
Sambrook et al., Molecular Cloning: A Laboratory Manual, 
Second Edition (Cold Spring Harbor Laboratories, New 
York, 1989). In a preferred embodiment of the invention, 
polymerase chain reactions (PCR) may be used to amplify 
fragments of DNA to be used as template DNA as de- 
scribed in Innis et al., ed. PCR Protocols (Academic Press, 
New York, 1990). 

[0040] The amount of DNA template needed for accurate detec- 
tion of the polymerase reaction will depend on the detec- 
tion technique used. For example, for optical detection, 
e.g., fluorescence or chemiluminescence detection, rela- 
tively small quantities of DNA in the femtomole range are 
needed. For thermal detection quantities approaching one 
picomole may be required to detect the change in temperature
resulting from a DNA polymerase mediated extension
reaction.
[0041] In enzymatic sequencing reactions, the priming of DNA
synthesis is achieved by the use of an oligonucleotide 
primer with a base sequence that is complementary to, 
and therefore capable of binding to, a specific region on 
the template DNA sequence. In instances where the tem- 
plate DNA is obtained as single stranded DNA from bacte- 
riophage, or as double stranded DNA derived from plas- 
mids, "universal" primers that are complementary to se- 
quences in the vectors, i.e., the bacteriophage, cosmid and 
plasmid vectors, and that flank the template DNA, can be 
used. 

[0042] Primer oligonucleotides are chosen to form highly stable 
duplexes that bind to the template DNA sequences and 
remain intact during any washing steps during the exten- 
sion cycles. Preferably, the length of the primer oligonu- 
cleotide is from 18-30 nucleotides and contains a bal- 
anced base composition. The structure of the primer 
should also be analyzed to confirm that it does not con- 
tain regions of dyad symmetry which can fold and self an- 
neal to form secondary structures thereby rendering the 
primers inefficient. Conditions for selecting appropriate 



hybridization conditions for binding of the oligonucleotide 
primers in the template systems will depend on the primer 
sequence and are well known to those of skill in the art. 
[0043] In utilizing the reactive sequencing method of the invention,
a variety of different DNA polymerases may be used
to incorporate dNTPs onto the 3' end of the primer which
is hybridized to the template DNA molecule. Such DNA
polymerases include but are not limited to Taq polymerase,
T7 or T4 polymerase, and Klenow polymerase. In
a preferred embodiment of the invention, described in detail
below, DNA polymerases lacking 3'-5' exonuclease
proofreading activity are used in the sequencing reactions.
For the most rapid reaction kinetics, the amount of
polymerase is sufficient to ensure that each DNA molecule
carries a non-covalently attached polymerase molecule
during reaction. For a typical equilibrium constant of ~50
nM for the dissociation equilibrium:

DNA-Pol ⇌ DNA + Pol,  K ≈ 50 nM

the desired condition is: [Pol] > 50 nM + [DNA].
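
To give a feel for this condition, the sketch below computes the fraction of DNA carrying a bound polymerase for a simple 1:1 binding equilibrium. It assumes that [Pol] and [DNA] denote total concentrations in nM and that K is the dissociation constant; the function name is hypothetical and the quadratic solution is the standard one for 1:1 binding.

```python
import math

def fraction_dna_bound(pol_total_nM: float, dna_total_nM: float,
                       kd_nM: float = 50.0) -> float:
    """Fraction of DNA molecules in the DNA-Pol complex at equilibrium.

    Solves K = [DNA][Pol]/[DNA-Pol] with mass conservation for both species
    (1:1 binding model, an assumption of this sketch).
    """
    s = pol_total_nM + dna_total_nM + kd_nM
    complex_nM = (s - math.sqrt(s * s - 4.0 * pol_total_nM * dna_total_nM)) / 2.0
    return complex_nM / dna_total_nM

if __name__ == "__main__":
    # At the threshold [Pol] = 50 nM + [DNA], just over half of the DNA is bound.
    print(round(fraction_dna_bound(60.0, 10.0), 3))
    # A large excess of polymerase drives occupancy toward saturation.
    print(round(fraction_dna_bound(1000.0, 10.0), 3))
```

Under this model the stated inequality guarantees that free polymerase exceeds K, so more than half of the DNA is occupied; higher occupancy requires a correspondingly larger excess of enzyme.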

[0044] In addition, reverse transcriptase, which catalyzes the synthesis
of single stranded DNA from an RNA template, may
be utilized in the reactive sequencing method of the in- 
vention to sequence messenger RNA (mRNA). Such a 
method comprises sequentially contacting an RNA tem- 
plate annealed to a primer (RNA primer/template) with 
dNTPs in the presence of reverse transcriptase enzyme to 
determine the sequence of the RNA. Because mRNA is 
produced by RNA polymerase-catalyzed synthesis from a 
DNA template, and thus contains the sequence informa- 
tion of the DNA template strand, sequencing the mRNA 
yields the sequence of the DNA gene from which it was 
transcribed. Eukaryotic mRNAs have poly(A) tails and 
therefore the primer for reverse transcription can be an 
oligo(dT). Typically, it will be most convenient to synthe- 
size the oligo(dT) primer with a terminal biotin or amino 
group through which the primer can be captured on a 
substrate and subsequently hybridize to and capture the 
template mRNA strand. 
[0045] The extension reactions are carried out in buffer solutions 
which contain the appropriate concentrations of salts, 
dNTPs and DNA polymerase required for the DNA poly- 
merase mediated extension to proceed. For guidance re- 
garding such conditions see, for example, Sambrook, et 



al., (1989, Molecular Cloning, A Laboratory Manual, Cold 
Spring Harbor Press, N.Y.); and Ausubel, et al. (1989, Cur- 
rent Protocols in Molecular Biology, Green Publishing As- 
sociates and Wiley Interscience, N.Y. ). 

[0046] Typically, buffer containing one of the four dNTPs is 

added into a reaction cell. Depending on the identity of 
the nucleoside base at the next unpaired template site in 
the primer/template system, a reaction will occur when 
the reaction cell contains the appropriate dNTP. When the 
reaction cell contains any one of the other three incorrect 
dNTPs, no reaction will take place. 

[0047] The reaction cell is then flushed with dNTP free buffer and 
the cycle is repeated until a complete DNA sequence is 
identified. Detection of a DNA polymerase mediated ex- 
tension can be made using any of the detection methods 
described in detail below including optical and thermal 
detection of an extension reaction. 

[0048] In some instances, a nucleotide solution is found to be
contaminated with any of the other three nucleotides. In 
such instances a small fraction of strands may be ex- 
tended by incorporation of an impurity dNTP when the 
dNTP type supplied is incorrect for extension, producing a 
population of strands which are subsequently extended 



ahead of the main strand population. Thus, in an embodiment
of the invention, each nucleotide solution can be
treated to remove any contaminating nucleotides. Treatment
of each nucleotide solution involves reaction of the
solution prior to use with immobilized DNA complementary
to each of the possibly contaminating nucleotides. For
example, a dATP solution will be allowed to react with im- 
mobilized poly (dA), poly (dG) or poly (dC), with appropri- 
ate primers and polymerase, for a time sufficient to incor- 
porate any contaminating dTTP, dCTP and dGTP nu- 
cleotides into DNA. 
[0049] In a preferred embodiment of the invention, the primer/
template system comprises the template DNA tethered to 
a solid phase support to permit the sequential addition of 
sequencing reaction reagents without complicated and 
time consuming purification steps following each exten- 
sion reaction. Preferably, the template DNA is covalently 
attached to a solid phase support, such as the surface of a 
reaction flow cell, a polymeric microsphere, filter material, 
or the like, which permits the sequential application of se- 
quencing reaction reagents, i.e., buffers, dNTPs and DNA 
polymerase, without complicated and time consuming pu- 
rification steps following each extension reaction. Alter- 



natively, for applications that require sequencing of many 
samples containing the same vector template or same 
gene, for example, in diagnostic applications, a universal 
primer may be tethered to a support, and the template 
DNA allowed to hybridize to the immobilized primer. 

[0050] The DNA may be modified to facilitate covalent or non-covalent
tethering of the DNA to a solid phase support.
For example, when PCR is used to amplify DNA fragments,
the 5' ends of one set of PCR primer oligonucleotide
strands may be modified to carry a linker moiety for tethering
one of the two complementary types of DNA strands
produced to a solid phase support. Such linker moieties
include, for example, biotin. When using biotin, the
biotinylated DNA fragments may be bound non-covalently
to streptavidin covalently attached to the solid phase support.
Alternatively, an amino group (-NH2) may be chemically
incorporated into one of the PCR primer strands and
used to covalently link the DNA template to a solid phase
support using standard chemistry, such as reactions with
N-hydroxysuccinimide activated agarose surfaces.

[0051] In another embodiment, the 5' end of the sequencing
oligonucleotide primer may be modified with biotin, for
non-covalent capture to a streptavidin-treated support, or
with an amino group for chemical linkage to a solid support;
the template strands are then captured by the non-covalent
binding attraction between the immobilized
primer base sequence and the complementary sequence
on the template strands. Methods for immobilizing DNA
on a solid phase support are well known to those of skill
in the art and will vary depending on the solid phase support
chosen.

[0052] In the reactive sequencing method of the present invention,
DNA polymerase is presented sequentially with each
of the 4 dNTPs. In the majority of the reaction cycles, only
incorrect dNTPs will be present, thereby increasing the
likelihood of misincorporation of incorrect nucleotides
into the extending DNA primer/template system.

[0053] Accordingly, the present invention further provides methods
for optimizing the reactive sequencing reaction to
achieve rapid and complete incorporation of the correct
nucleotide into the DNA primer/template system, while
limiting the misincorporation of incorrect nucleotides. For
example, dNTP concentrations may be lowered to reduce
misincorporation of incorrect nucleotides into the DNA
primer. Km values for incorrect dNTPs can be as much as
1000-fold higher than for correct nucleotides, indicating
that a reduction in dNTP concentrations can reduce the
rate of misincorporation of nucleotides. Thus, in a preferred
embodiment of the invention the concentration of
dNTPs in the sequencing reactions is approximately 5 -
20 µM. At this concentration, incorporation rates are as
close to the maximum rate of 400 nucleotides/s for T4
DNA polymerase as possible.
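
The effect of dNTP concentration on discrimination can be sketched with simple Michaelis-Menten kinetics. The Km value for the correct dNTP used below is hypothetical, and the sketch assumes, purely for illustration, that correct and incorrect dNTPs share the same maximum rate; only the 400 nucleotides/s maximum rate, the 1000-fold Km ratio and the 5 - 20 µM range are taken from the text.

```python
def incorporation_rate(conc_uM, km_uM, vmax_per_s=400.0):
    """Michaelis-Menten rate (nucleotides/s) at a given dNTP concentration."""
    return vmax_per_s * conc_uM / (km_uM + conc_uM)

KM_CORRECT_UM = 1.0                        # hypothetical value, for illustration only
KM_INCORRECT_UM = 1000.0 * KM_CORRECT_UM   # "as much as 1000-fold higher"

def discrimination(conc_uM):
    """Ratio of correct to incorrect incorporation rates (same Vmax assumed)."""
    return (incorporation_rate(conc_uM, KM_CORRECT_UM)
            / incorporation_rate(conc_uM, KM_INCORRECT_UM))

for conc in (5.0, 20.0, 1000.0):
    print(f"{conc:7.1f} uM: correct {incorporation_rate(conc, KM_CORRECT_UM):6.1f}/s, "
          f"discrimination {discrimination(conc):6.1f}-fold")
```

Under these assumptions the correct dNTP is already incorporated at most of the maximum rate at 5 - 20 µM, while the advantage over an incorrect dNTP shrinks as the concentration is raised.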

[0054] In addition, relatively short reaction times can be used to
reduce the probability of misincorporation. For an incorporation
rate approaching the maximum rate of ~ 400
nucleotides/s, a reaction time of approximately 25 milliseconds
(ms) will be sufficient to ensure extension of
99.99% of primer strands.
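
The reaction time quoted above can be checked with a short calculation. The sketch assumes, as a simplification not stated in the text, that single-base extension follows first-order kinetics with a rate constant equal to the incorporation rate, so that the unextended fraction decays as exp(-k t).

```python
import math

def time_for_fraction(fraction_extended, rate_per_s=400.0):
    """Time (s) needed to extend a given fraction of strands,
    assuming first-order kinetics with rate constant `rate_per_s`."""
    return -math.log(1.0 - fraction_extended) / rate_per_s

t = time_for_fraction(0.9999)
print(f"{t * 1e3:.1f} ms")
```

Under this assumption the result is about 23 ms, consistent with the reaction time of approximately 25 ms given in the text.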

[0055] In a specific embodiment of the invention, DNA
polymerases lacking 3' to 5' exonuclease activity may be
used for reactive sequencing to limit exonucleolytic
degradation of primers that would occur in the absence of
correct dNTPs. In the presence of all four dNTPs, misincorporation
frequencies by DNA polymerases possessing
exonucleolytic proofreading activity are as low as one error
in 10⁶ to 10⁸ nucleotides incorporated as discussed in
Echols and Goodman (1991, Annu. Rev. Biochem.
60:477-511); Goodman, et al. (1993, Crit. Rev.
Biochem. Molec. Biol. 28:83-126); and Loeb and Kunkel
(1982, Annu. Rev. Biochem. 51:429-457). In the absence
of proofreading, DNA polymerase error rates are typically
on the order of 1 in 10⁴ to 1 in 10⁶. Although exonuclease
activity increases the fidelity of a DNA polymerase, the use
of DNA polymerases having proofreading activity can pose
technical difficulties for the reactive sequencing method
of the present invention. Not only will the exonuclease remove
any misincorporated nucleotides, but also, in the
absence of a correct dNTP complementary to the next
template base, the exonuclease will remove correctly-paired
nucleotides successively until a point on the template
sequence is reached where the base is complementary
to the dNTP in the reaction cell. At this point, an
idling reaction is established where the polymerase repeatedly
incorporates the correct dNMP and then removes
it. Only when a correct dNTP is present will the rate of
polymerase activity exceed the exonuclease rate so that
an idling reaction is established that maintains the incorporation
of that correct nucleotide at the 3' end of the
primer.

[0056] A number of T4 DNA polymerase mutants containing specific
amino acid substitutions possess reduced exonuclease
activity levels up to 10,000-fold less than the wild-type
enzyme. For example, Reha-Krantz and Nonay (1993,
J. Biol. Chem. 268:27100-27108) report that when Asp
112 was replaced with Ala and Glu 114 was replaced with
Ala (D112A/E114A) in T4 polymerase, these two amino
acid substitutions reduced the exonuclease activity on
double stranded DNA by a factor of about 300 relative to
the wild type enzyme. Such mutants may be advantageously
used in the practice of the invention for incorporation
of nucleotides into the DNA primer/template system.

[0057] In yet another embodiment of the invention, DNA polymerases
which are more accurate than wild type polymerases
at incorporating the correct nucleotide into a
DNA primer/template may be used. For example, in a
(D112A/E114A) mutant T4 polymerase with a third mutation
where Ile 417 is replaced by Val
(I417V/D112A/E114A), the I417V mutation results in an
antimutator phenotype for the polymerase (Reha-Krantz
and Nonay, 1994, J. Biol. Chem. 269:5635-5643; Stocki et
al., 1995, J. Mol. Biol. 254:15-28). This antimutator phenotype
arises because the polymerase tends to move the
primer ends from the polymerase site to the exonuclease
site more frequently, and thus to proofread more frequently,
than the wild type polymerase, which increases the accuracy
of synthesis.

[0058] In yet another embodiment of the invention, polymerase
mutants that are capable of more efficiently incorporating 
fluorescent-labeled nucleotides into the template DNA 
system molecule may be used in the practice of the inven- 
tion. The efficiency of incorporation of fluorescent-la- 
beled nucleotides may be reduced due to the presence of 
bulky fluorophore labels that may inhibit dNTP interaction 
at the active site of the polymerase. Polymerase mutants 
that may be advantageously used for incorporation of flu- 
orescent-labeled dNTPs into DNA include but are not lim- 
ited to those described in U.S. Application Serial No. 
08/632,742 filed April 16, 1996 which is incorporated by 
reference herein. 

[0059] In a preferred embodiment of the invention, the reactive
sequencing method utilizes a two-cycle system. An
exonuclease-deficient polymerase is used in the first cycle
and a mixture of exonuclease-deficient and exonuclease-proficient
enzymes is used in the second cycle. In the
first cycle, the primer/template system together with an
exonuclease-deficient polymerase will be presented sequentially
with each of the four possible nucleotides. Reaction
time and conditions will be such that a sufficient
fraction of primers are extended to allow for detection
and quantification of nucleotide incorporation, ~ 98%, for
accurate quantification of multiple single-base repeats. In
the second cycle, after identification of the correct nucleotide,
a mixture of exonuclease-proficient and exonuclease-deficient
polymerases, or a polymerase containing both types of
activity, will be added together with the
correct dNTP identified in the first cycle to complete and
proofread the primer extension. In this way, an exonuclease-proficient
polymerase is only present in the reaction
cell when the correct dNTP is present, so that exonucleolytic
degradation of correctly extended strands does not
occur, while degradation and correct re-extension of previously
incorrectly extended strands does occur, thus
achieving extremely accurate strand extension.
[0060] The detection of a DNA polymerase mediated extension 
reaction can be accomplished in a number of ways. For 
example, the heat generated by the extension reaction 
can be measured using a variety of different techniques 
such as those employing thermopile, thermistor and re- 
fractive index measurements. 



[0061] In an embodiment of the invention, the heat generated by
a DNA polymerase mediated extension reaction can be
measured. For example, in a reaction cell volume of (100
micrometers)³ containing 1 µg of water as the sole thermal
mass and 2×10¹¹ DNA template molecules (300 fmol) tethered
within the cell, the temperature of the water increases
by 1×10⁻³ °C for a polymerase reaction which extends
the primer by a single nucleoside monophosphate.
This calculation is based on the experimental determination
that a one base pair extension in a DNA chain is an
exothermic reaction and the enthalpy change associated
with this reaction is 3.5 kcal/mole of base. Thus extension
of 300 fmol of primer strands by a single base produces
300 fmol x 3.5 kcal/mol or 1×10⁻⁹ cal of heat.
This is sufficient to raise the temperature of 1 µg of water
by 1×10⁻³ °C. Such a temperature change can be readily
detectable using thermistors (sensitivity < I0°C); thermopiles
(sensitivity < 10⁻⁵ °C); and refractive index measurements
(sensitivity < 10°C).
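
The arithmetic of the preceding paragraph can be reproduced as follows. This is a minimal sketch; the specific heat of water (1 cal per gram per °C) and the Avogadro constant are standard values supplied here, and the function and constant names are chosen only for illustration.

```python
AVOGADRO = 6.022e23             # molecules per mole
ENTHALPY_CAL_PER_MOL = 3.5e3    # 3.5 kcal/mol of base, from the text
WATER_CP_CAL_PER_G_C = 1.0      # specific heat of water, cal/(g * degC)

def temperature_rise(moles_extended, water_mass_g):
    """Return (heat in cal, temperature rise in degC) for a one-base extension."""
    heat_cal = moles_extended * ENTHALPY_CAL_PER_MOL
    return heat_cal, heat_cal / (water_mass_g * WATER_CP_CAL_PER_G_C)

moles = 2e11 / AVOGADRO                      # about 330 fmol of template
heat, delta_t = temperature_rise(300e-15, 1e-6)
print(f"{moles * 1e15:.0f} fmol, {heat:.2e} cal, {delta_t:.2e} degC")
```

The result, roughly 1×10⁻⁹ cal and 1×10⁻³ °C, agrees with the figures stated above.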
[0062] In a specific embodiment of the invention, thermopiles
may be used to detect temperature changes. Such thermopiles
are known to have a high sensitivity to temperature
and can make measurements in the tens of microdegree
range with time constants of several seconds. Thermopiles
may be fabricated by constructing serial sets of
junctions of two dissimilar metals and physically arrang- 
ing the junctions so that alternating junctions are sepa- 
rated in space. One set of junctions is maintained at a 
constant reference temperature, while the alternate set of 
junctions is located in the region whose temperature is to 
be sensed. A temperature difference between the two sets 
of junctions produces a potential difference across the 
junction set which is proportional to the temperature dif- 
ference, to the thermoelectric coefficient of the junction 
and to the number of junctions. For optimum response, 
bimetallic pairs with a large thermoelectric coefficient are 
desirable, such as bismuth and antimony. Thermopiles 
may be fabricated using thin film deposition techniques in 
which evaporated metal vapor is deposited onto insulating 
substrates through specially fabricated masks. Ther- 
mopiles that may be used in the practice of the invention 
include thermopiles such as those described in U.S. Patent 
4,935,345, which is incorporated by reference herein. 
[0063] In a specific embodiment of the invention, miniature thin
film thermopiles produced by metal evaporation techniques,
such as those described in U.S. Patent 4,935,345
incorporated herein by reference, may be used to detect
the enthalpy changes. Such devices have been made by 
vacuum evaporation through masks of about 10 mm 
square. Using methods of photolithography, sputter etch- 
ing and reverse lift-off techniques, devices as small as 2 
mm square may be constructed without the aid of modern 
microlithographic techniques. These devices contain 150 
thermoelectric junctions and employ 12 micron line 
widths and can measure the exothermic heat of reaction 
of enzyme-catalyzed reactions in flow streams where the 
enzyme is preferably immobilized on the surface of the 
thermopile. 
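
The proportionality described above (output voltage proportional to the temperature difference, the thermoelectric coefficient and the number of junctions) can be sketched numerically. The thermoelectric coefficient used below, 100 µV per °C per bismuth-antimony junction, is an assumed round figure for illustration and is not given in the text; the junction count of 150 and the 1×10⁻³ °C temperature change are taken from the description.

```python
def thermopile_voltage(n_junctions, coeff_v_per_c, delta_t_c):
    """Idealized thermopile output: V = N * S * dT."""
    return n_junctions * coeff_v_per_c * delta_t_c

ASSUMED_COEFF_V_PER_C = 100e-6   # assumed Bi-Sb coefficient, illustration only

v = thermopile_voltage(150, ASSUMED_COEFF_V_PER_C, 1e-3)
print(f"{v * 1e6:.1f} microvolts")
```

Under these assumptions a single-base extension would produce a signal on the order of tens of microvolts.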

[0064] To incorporate thermopile detection technology into a reactive
sequencing device, thin-film bismuth-antimony
thermopiles 2, as shown in Figure 1, may be fabricated by 
successive electron-beam evaporation of bismuth and an- 
timony metals through two different photolithographi- 
cally-generated masks in order to produce a zigzag array 
of alternating thin bismuth and antimony wires which are 
connected to form two sets of bismuth-antimony thermocouple
junctions. Modern microlithographic techniques
will allow fabrication of devices at least one order of magnitude
smaller than those previously made, i.e., with line
widths as small as 1 µm and overall dimensions on the order
of 100 µm². One set of junctions 4 (the sensor junctions)
is located within the reaction cell 6, i.e., deposited
on a wall of the reaction cell, while the second reference 
set of junctions 8 is located outside the cell at a reference 
point whose temperature is kept constant. Any difference 
in temperature between the sensor junctions and the ref- 
erence junctions results in an electric potential being gen- 
erated across the device, which can be measured by a 
high-resolution digital voltmeter 10 connected to mea- 
surement points 12 at either end of the device. It is not 
necessary that the temperature of the reaction cell and the 
reference junctions be the same in the absence of a poly- 
merase reaction event, only that a change in the tempera- 
ture of the sensor junctions due to a polymerase reaction 
event be detectable as a change in the voltage generated 
across the thermopile. 
[0065] In addition to thermopiles, as shown in Figure 2, a thermistor
14 may also be used to detect temperature
changes in the reaction cell 6 resulting from DNA poly- 
merase mediated incorporation of dNMPs into the DNA 
primer strand. Thermistors are semiconductors composed 
of a sintered mixture of metallic oxides such as manganese,
nickel, and cobalt oxides. This material has a
large temperature coefficient of resistance, typically ~ 4% 
per °C, and so can sense extremely small temperature 
changes when the resistance is monitored with a stable, 
high-resolution resistance-measuring device such as a 
digital voltmeter, e.g., Keithley Instruments Model 2002. A 
thermistor 14, such as that depicted in Figure 2, may be 
fabricated in the reactive sequencing reaction cell by 
sputter depositing a thin film of the active thermistor ma- 
terial onto the surface of the reaction cell from a single 
target consisting of hot pressed nickel, cobalt and man- 
ganese oxides. Metal interconnections 16 which extend 
out beyond the wall of the reaction cell may also be fabri- 
cated in a separate step so that the resistance of the ther- 
mistor may be measured using an external measuring de- 
vice 18. 
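
The resistance change that such a thermistor would show can be estimated from the temperature coefficient quoted above. The nominal resistance of 10 kilohms is a hypothetical value chosen for illustration; only the ~ 4% per °C coefficient and the 1×10⁻³ °C temperature change come from the text, and only the magnitude of the change is computed.

```python
def resistance_change(r_nominal_ohm, coeff_per_c, delta_t_c):
    """Magnitude of the resistance change for a small temperature change."""
    return r_nominal_ohm * coeff_per_c * delta_t_c

R_NOMINAL_OHM = 10_000.0   # hypothetical thermistor resistance

d_r = resistance_change(R_NOMINAL_OHM, 0.04, 1e-3)
print(f"{d_r:.2f} ohm ({d_r / R_NOMINAL_OHM:.0e} fractional change)")
```

A fractional change of a few parts in 10⁵ indicates why a stable, high-resolution resistance-measuring device is called for.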

[0066] Temperature changes may also be sensed using a refrac- 
tive index measurement technique. For example, tech- 
niques such as those described in Bornhop (1995, Applied 
Optics 34:3234-323) and U.S. Patent 5,325,170, may be 
used to detect refractive index changes for liquids in cap- 
illaries. In such a technique, a low-power He-Ne laser is 
aimed off-center at a right angle to a capillary and undergoes
multiple internal reflection. Part of the beam travels
through the liquid while the remainder reflects only off
the external capillary wall. The two beams undergo different
phase shifts depending on the refractive index difference
between the liquid and capillary. The result is an
interference pattern, with the fringe position extremely
sensitive to temperature-induced refractive index changes.
[0067] In a further embodiment of the invention, the thermal response
of the system may be increased by the presence of
inorganic pyrophosphatase enzyme which is contacted 
with the template system along with the dNTP solution. 
Additionally, heat is released as the pyrophosphate re- 
leased from the dNTPs upon incorporation into the tem- 
plate system is hydrolyzed by inorganic pyrophosphatase 
enzyme. 

[0068] In another embodiment, the pyrophosphate released upon
incorporation of dNTPs may be removed from the template
system and hydrolyzed, and the resultant heat detected,
using thermopile, thermistor or refractive index
methods, in a separate reaction cell downstream. In this 
reaction cell, inorganic pyrophosphatase enzyme may be 
mixed in solution with the dNTP removed from the DNA 
template system, or alternatively the inorganic pyrophosphatase
enzyme may be covalently tethered to the wall of
the reaction cell. 

[0069] Alternatively, the polymerase-catalyzed incorporation of a 
nucleotide base can be detected using fluorescence and 
chemiluminescence detection schemes. The DNA poly- 
merase mediated extension is detected when a fluores- 
cent or chemiluminescent signal is generated upon incor- 
poration of a fluorescently or chemiluminescently labeled 
dNMP into the extending DNA primer strand. Such tags 
are attached to the nucleotide in such a way as to not in- 
terfere with the action of the polymerase. For example, 
the tag may be attached to the nucleotide base by a linker 
arm sufficiently long to move the bulky fluorophore away 
from the active site of the enzyme. 

[0070] For use of such detection schemes, nucleotide bases are 
labeled by covalently attaching a compound such that a 
fluorescent or chemiluminescent signal is generated fol- 
lowing incorporation of a dNTP into the extending DNA 
primer/template. Examples of fluorescent compounds for 
labeling dNTPs include but are not limited to fluorescein, 
rhodamine, and BODIPY
(4,4-difluoro-4-bora-3a,4a-diaza-s-indacene). See
"Handbook of Molecular Probes and Fluorescent Chemicals",
available from Molecular Probes, Inc. (Eugene, OR).
Examples of chemiluminescence based compounds that
may be used in the sequencing methods of the invention
include but are not limited to luminol and dioxetanones
(See, Gundermann and McCapra, "Chemiluminescence in
Organic Chemistry", Springer-Verlag, Berlin Heidelberg,
1987).

[0071] Fluorescently or chemiluminescently labeled dNTPs are 
added individually to a DNA template system containing 
template DNA annealed to the primer, DNA polymerase 
and the appropriate buffer conditions. After the reaction 
interval, the excess dNTP is removed and the system is 
probed to detect whether a fluorescent or chemilumines- 
cent tagged nucleotide has been incorporated into the 
DNA template. Detection of the incorporated nucleotide 
can be accomplished using different methods that will de- 
pend on the type of tag utilized. 

[0072] For fluorescently-tagged dNTPs the DNA template system 
may be illuminated with optical radiation at a wavelength 
which is strongly absorbed by the tag entity. Fluorescence 
from the tag is detected using for example a photodetec- 
tor together with an optical filter which excludes any scat- 
tered light at the excitation wavelength. 



[0073] Since labels on previously incorporated nucleotides would
interfere with the signal generated by the most recently 
incorporated nucleotide, it is essential that the fluorescent 
tag be removed at the completion of each extension reac- 
tion. To facilitate removal of a fluorescent tag, the tag 
may be attached to the nucleotide via a chemically or 
photochemically cleavable linker using methods such as 
those described by Metzker, M.L., et al. (1994, Nucleic
Acids Research 22:4259-4267) and Burgess, K., et al., 
(1997, J. Org. Chem. 62:5165-5168) so that the fluores- 
cent tag may be removed from the DNA template system 
before a new extension reaction is carried out. 

[0074] In a further embodiment utilizing fluorescent detection,
the fluorescent tag is attached to the dNTP by a photo- 
cleavable or chemically cleavable linker, and the tag is de- 
tached following the extension reaction and removed from 
the template system into a detection cell where the pres- 
ence, and the amount, of the tag is determined by optical 
excitation at a suitable wavelength and detection of fluo- 
rescence. In this embodiment, the possibility of fluores- 
cence quenching, due to the presence of multiple fluores- 
cent tags immediately adjacent to one another on a 
primer strand which has been extended complementary to 
a single base repeat region in the template, is minimized, 
and the accuracy with which the repeat number can be 
determined is optimized. In addition, excitation of fluo- 
rescence in a separate chamber minimizes the possibility 
of photolytic damage to the DNA primer/template system. 
[0075] In an additional embodiment utilizing fluorescent detection,
the signal from the fluorescent tag can be destroyed
using a chemical reaction which specifically targets the 
fluorescent moiety and reacts to form a final product 
which is no longer fluorescent. In this embodiment, the 
fluorescent tag attached to the nucleotide base is de- 
stroyed following extension and detection of the fluores- 
cence signal, without the removal of the tag. In a specific 
embodiment, fluorophores attached to dNTP bases may 
be selectively destroyed by reaction with compounds ca- 
pable of extracting an electron from the excited state of 
the fluorescent moiety thereby producing a radical ion of 
the fluorescent moiety which then reacts to form a final 
product which is no longer fluorescent. In a further spe- 
cific embodiment, the signal from a fluorescent tag is de- 
stroyed by photochemical reaction with the cation of a 
diphenyliodonium salt following extension and detection 
of the fluorescence label. The fluorescent tag attached to 
the incorporated nucleotide base is destroyed, without re- 
moval of the tag, by the addition of a solution of a 
diphenyliodonium salt to the reaction cell and subsequent 
UV light exposure. The diphenyliodonium salt solution is 
removed and the reactive sequencing is continued. This 
embodiment does not require dNTPs with chemically or
photochemically cleavable linkers, since the fluorescent 
tag need not be removed. 

[0076] In a further embodiment of the technique, the response
generated by a DNA polymerase-mediated extension re- 
action can be amplified. In this embodiment, the dNTP is 
chemically modified by the covalent attachment of a sig- 
naling tag through a linker that can be cleaved either 
chemically or photolytically. Following exposure of the 
dNTP to the primer/template system and flushing away 
any unincorporated chemically modified dNTP, any signal- 
ing tag that has been incorporated is detached by a 
chemical or photolytic reaction and flushed out of the re- 
action chamber to an amplification chamber in which an 
amplified signal may be produced and detected. 

[0077] A variety of methods may be used to produce an amplified
signal. In one such method the signaling tag has a catalytic
function. When the catalytic tag is cleaved and allowed
to react with its substrate, many cycles of chemical
reaction ensue producing many moles of product per
mole of catalytic tag, with a corresponding multiplication
of reaction enthalpy. Either the reaction product is detected,
through some property such as color or absorbency,
or the amplified heat product is detected by a
thermal sensor. For example, an enzyme may be covalently
attached to the dNTP via a cleavable linker arm of sufficient
length that the enzyme does not interfere with the
active site of the polymerase enzyme. Following incorporation
onto the DNA primer strand, that enzyme is detached
and transported to a second reactor volume in
which it is allowed to interact with its specific substrate;
thus an amplified response is obtained as each enzyme
molecule carries out many cycles of reaction. For example,
the enzyme catalase (CAT) catalyzes the reaction:

H₂O₂ --(CAT)--> H₂O + ½ O₂ + ~100 kJ/mol heat



[0078] If each dNTP is tagged with a catalase molecule which is
detached after dNMP incorporation and allowed to react
downstream with hydrogen peroxide, each nucleotide incorporation
would generate ~ 25 kcal/mol x N of heat
where N is the number of hydrogen peroxide molecules
decomposed by the catalase. The heat of decomposition
of hydrogen peroxide is already ~ 6-8 times greater than
for nucleotide incorporation, (i.e. 3.5 - 4 kcal/mol). For
decomposition of ~ 150 hydrogen peroxide molecules the
amount of heat generated per base incorporation ap- 
proaches 1000 times that of the unamplified reaction. 
Similarly, enzymes which produce colored products, such 
as those commonly used in enzyme-linked immunosor- 
bent assays (ELISA) could be incorporated as detachable 
tags. For example the enzyme alkaline phosphatase con- 
verts colorless p-nitrophenyl phosphate to a colored 
product (p-nitrophenol); the enzyme horseradish peroxi- 
dase converts colorless o-phenylenediamine hydrochlo- 
ride to an orange product. Chemistries for linking these 
enzymes to proteins such as antibodies are well-known to 
those versed in the art, and could be adapted to link the 
enzymes to nucleotide bases via linker arms that maintain 
the enzymes at a distance from the active site of the poly- 
merase enzymes. 
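
The amplification factors quoted for the catalase tag can be verified with a short calculation. This is a sketch only; it uses the ~ 25 kcal/mol and 3.5 kcal/mol figures from the text, and the function name is chosen for illustration.

```python
HEAT_H2O2_KCAL_PER_MOL = 25.0            # ~100 kJ/mol, from the text
HEAT_INCORPORATION_KCAL_PER_MOL = 3.5    # from the text

def amplification(n_h2o2_decomposed):
    """Heat released per incorporation relative to the unamplified reaction."""
    return (n_h2o2_decomposed * HEAT_H2O2_KCAL_PER_MOL
            / HEAT_INCORPORATION_KCAL_PER_MOL)

print(f"N = 1:   {amplification(1):.1f}-fold")
print(f"N = 150: {amplification(150):.0f}-fold")
```

A single hydrogen peroxide molecule gives a gain of about 7, and about 150 molecules give a gain of roughly 1000, in line with the values stated above.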
[0079] In a further embodiment, an amplified thermal signal may
be produced when the signaling tag is an entity which can
stimulate an active response in cells which are attached 
to, or held in the vicinity of, a thermal sensor such as a 
thermopile or thermistor. Pizziconi and Page (1997, 
Biosensors and Bioelectronics 12:457-466) reported that 
harvested and cultured mast cell populations could be ac- 
tivated by calcium ionophore to undergo exocytosis to re- 
lease histamine, up to 10 - 30 pg (100 - 300 fmol) per 
cell. The multiple cell reactions leading to exocytosis are 
themselves exothermic. This process is further amplified 
using the enzymes diamine oxidase to oxidize the his- 
tamine to hydrogen peroxide and imidazoleacetaldehyde, 
and catalase to disproportionate the hydrogen peroxide. 
The two reactions together liberate over 100 kJ of heat per
mole of histamine. For example, a calcium ionophore is 
covalently attached to the dNTP base via a linker arm 
which distances the linked calcium ionophore from the 
active site of the polymerase enzyme and is chemically or 
photochemically cleavable. Following the DNA polymerase 
catalyzed incorporation step, and flushing away unincor- 
porated nucleotides any calcium ionophore remaining 
bound to an incorporated nucleotide may be cleaved and 
flushed downstream to a detection chamber containing a 
mast cell-based sensor such as described by Pizziconi and 
Page (1997, Biosensors and Bioelectronics 12:457-466). 
The calcium ionophore would bind to receptors on the 
mast cells stimulating histamine release with the accom- 
panying generation of heat. The heat production could be 
further amplified by introducing the enzymes diamine ox- 
idase to oxidize the histamine to hydrogen peroxide and 
imidazoleacetaldehyde, and catalase to disproportionate 
the hydrogen peroxide. Thus a significantly amplified heat 
signal would be produced which could readily be detected 
by a thermopile or thermistor sensor within, or in contact 
with, the reaction chamber. 
[0080] In a further embodiment utilizing chemiluminescent detection,
the chemiluminescent tag is attached to the dNTP
by a photocleavable or chemically cleavable linker. The tag 
is detached following the extension reaction and removed 
from the template system into a detection cell where the 
presence, and the amount, of the tag is determined by an 
appropriate chemical reaction and sensitive optical detec- 
tion of the light produced. In this embodiment, the possi- 
bility of a non-linear optical response due to the presence 
of multiple chemiluminescent tags immediately adjacent 
to one another on a primer strand which has been extended
complementary to a single base repeat region in
the template, is minimized, and the accuracy with which 
the repeat number can be determined is optimized. In ad- 
dition, generation of chemiluminescence in a separate 
chamber minimizes chemical damage to the DNA primer/ 
template system, and allows detection under harsh chem- 
ical conditions which otherwise would chemically damage 
the DNA primer/template. In this way, chemiluminescent 
tags can be chosen to optimize chemiluminescence reac- 
tion speed, or compatibility of the tagged dNTP with the 
polymerase enzyme, without regard to the compatibility of 
the chemiluminescence reaction conditions with the DNA 
primer/template. 
[0081] In a further embodiment of the invention, the concentration
of the dNTP solution removed from the template system
following each extension reaction can be measured
by detecting a change in UV absorption due to a change in 
the concentration of dNTPs, or a change in fluorescence 
response of fluorescently-tagged dNTPs. The incorpora- 
tion of nucleotides into the extended template would re- 
sult in a decreased concentration of nucleotides removed 
from the template system. Such a change could be detected
by measuring the UV absorption of the buffer removed
from the template system following each extension
cycle. 

[0082] In a further embodiment of the invention, extension of the
primer strand may be sensed by a device capable of sens- 
ing fluorescence from, or resolving an image of, a single 
DNA molecule. Devices capable of sensing fluorescence 
from a single molecule include the confocal microscope 
and the near-field optical microscope. Devices capable of 
resolving an image of a single molecule include the scan- 
ning tunneling microscope (STM) and the atomic force mi- 
croscope (AFM). 

[0083] In this embodiment of the invention, a single DNA template
molecule with attached primer is immobilized on a
surface and viewed with an optical microscope or an STM 
or AFM before and after exposure to buffer solution con- 
taining a single type of dNTP, together with polymerase 
enzyme and other necessary electrolytes. When an optical 
microscope is used, the single molecule is exposed seri- 
ally to fluorescently-tagged dNTP solutions and as before 
incorporation is sensed by detecting the fluorescent tag 
after excess unreacted dNTP is removed. Again as before, 
the incorporated fluorescent tag must be cleaved and dis- 
carded before a subsequent tag can be detected. Using 
the STM or AFM, the change in length of the primer strand 
is imaged to detect incorporation of the dNTP. Alterna- 
tively the dNTP may be tagged with a physically bulky 
molecule, more readily visible in the STM or AFM, and this 
bulky tag is removed and discarded before each fresh in- 
corporation reaction. 

[0084] When sequencing a single molecular template in this way,
the possibility of incomplete reaction producing erro- 
neous signal and out-of-phase strand extension, does not 
exist and the consequent limitations on read length do 
not apply. For a single molecular template, reaction either 
occurs or it does not, and if it does not, then extension 
either ceases and is known to cease, or correct extension 
occurs in a subsequent cycle with the correct dNTP. In the 
event that an incorrect nucleotide is incorporated, which 
has the same probability as more the multiple strand pro- 
cesses discussed earlier, for example 1 in 1,000, an error 
is recorded in the sequence, but this error does not prop- 
agate or affect subsequent readout and so the read length 
is not limited by incorrect incorporation. 

[0085] DETECTION AND COMPENSATION FOR DNA POLYMERASE 
ERRORS 

[0086] In the reactive sequencing process, extension failures will
typically arise due to the kinetics of the extension reaction
and limitations on the amount of time allotted for each 
extension trial with the single deoxynucleotide triphos- 
phates (dNTP's). When reaction is terminated by flushing 
away the dNTP supply, some small fraction of the primer 
strands may remain unextended. These strands on subse- 
quent dNTP reaction cycles will continue to extend but will 
be out of phase with the majority strands, giving rise to 
small out-of-phase signals (i.e. signaling a positive incor- 
poration for an added dNTP which is incorrect for exten- 
sion of the majority strands). Because extension failure 
can occur, statistically, on any extension event, these out- 
of-phase signals will increase as the population of strands 
with extension failures grows. Ultimately the out- 
of-phase signal becomes comparable in amplitude with 
the signal due to correct extension of the majority strands 
and the sequence may be unreadable. The length by 
which the primer has been extended when the sequence 
becomes unreadable is known as the sequencing read 
length. 
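
As a rough illustration of how extension failure limits read length, the following sketch assumes that every strand fails independently with a fixed probability at each extension event. The failure probabilities, the 50% threshold, and the function names are illustrative assumptions and are not taken from the text.

```python
import math


def in_phase_fraction(p_fail: float, n_extensions: int) -> float:
    """Fraction of strands with no extension failure after n extension events."""
    return (1.0 - p_fail) ** n_extensions


def extensions_until_half_out_of_phase(p_fail: float) -> int:
    """Smallest n at which the in-phase fraction drops to 50% or below.

    The 50% cut-off is an arbitrary stand-in for the point at which the
    out-of-phase signal becomes comparable to the in-phase signal.
    """
    return math.ceil(math.log(0.5) / math.log(1.0 - p_fail))


if __name__ == "__main__":
    for p in (0.01, 0.001):
        n = extensions_until_half_out_of_phase(p)
        print(f"p_fail={p}: in-phase fraction <= 50% after {n} extensions")
```

Under these assumptions the usable read length scales roughly as the inverse of the per-step failure probability, which is why the correction methods described next matter.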

[0087] The present invention relates to a method that can extend 
the sequencing read length in two ways: first, by discriminating
between the in-phase and out-of-phase signals,
and second, by calculating where, and how, a dNTP probe
sequence can be altered so as selectively to extend the 
out-of-phase strands to bring them back into phase with 
the majority strands. 
[0088] Specifically, a method is provided for discriminating be- 
tween the in-phase and out-of-phase sequencing signals 
comprising: (i) detecting and measuring error signals 
thereby determining the size of the trailing strand popu- 
lation; (ii) between the 3' terminus of the trailing strand 
primers and the 3' terminus of the leading strand primers; 
(iii) simulating the occurrence of an extension failure at a 
point upstream from the 3' terminus of the leading 
strands thereby predicting at each extension step the ex- 
act point in the sequence previously traversed by the 
leading strands to which the 3' termini of the trailing 
strands have been extended; (iv) predicting for each dNTP 
introduced the signal to be expected from correct exten- 
sion of the trailing strands; and (v) subtracting the pre- 
dicted signal from the measured signal to yield a signal 
due only to correct extension of the leading strand popu- 
lation. 
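
The subtraction in steps (iv) and (v) can be sketched as follows. This is a minimal, noise-free illustration and not the claimed implementation: it assumes a fixed four-base dNTP cycle, a signal proportional to the number of bases incorporated, a single trailing population created by one failure at the first flow, and a trailing fraction already known from step (i). All names (`FLOW_ORDER`, `simulate`, `correct`) are hypothetical. With an unchanged cycle the trailing strands repeat what the leading strands did one full cycle earlier, so the predicted trailing signal is the leading signal from four flows before.

```python
FLOW_ORDER = "ACGT"  # assumed fixed dNTP cycle


def incorporate(seq: str, pos: int, base: str) -> int:
    """Bases added at `pos` when `base` is supplied (a whole repeat extends at once)."""
    n = 0
    while pos + n < len(seq) and seq[pos + n] == base:
        n += 1
    return n


def simulate(seq: str, n_flows: int, skip_first: bool = False) -> list:
    """Per-flow incorporation counts; `skip_first` models one failure at flow 0."""
    pos, counts = 0, []
    for i in range(n_flows):
        base = FLOW_ORDER[i % 4]
        added = 0 if (skip_first and i == 0) else incorporate(seq, pos, base)
        pos += added
        counts.append(added)
    return counts


def measured_signal(seq: str, n_flows: int, trailing_fraction: float) -> list:
    """Mixed signal from leading strands plus a trailing population."""
    lead = simulate(seq, n_flows)
    trail = simulate(seq, n_flows, skip_first=True)
    f = trailing_fraction
    return [(1 - f) * a + f * b for a, b in zip(lead, trail)]


def correct(measured: list, trailing_fraction: float) -> list:
    """Subtract the predicted trailing signal to recover the leading signal."""
    f = trailing_fraction
    lead_est = []
    for i, m in enumerate(measured):
        # Predicted trailing signal: what the leading strands did 4 flows ago.
        trail_pred = lead_est[i - 4] if i >= 4 else 0.0
        lead_est.append((m - f * trail_pred) / (1 - f))
    return lead_est


if __name__ == "__main__":
    seq = "AACGTTAGC"  # bases to be added to the primer (illustrative)
    mixed = measured_signal(seq, 12, trailing_fraction=0.2)
    print([round(x, 3) for x in correct(mixed, 0.2)])
```

The sequence must begin with the first base of the cycle for the skipped flow to represent a genuine extension failure.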

[0089] "Upstream" refers to the known sequence of bases correctly
incorporated onto the primer strands. "Downstream"
refers to the sequence beyond the 3' terminus.
Thus for the leading strand population the downstream 
sequence is unknown but is predetermined by the se- 
quence of the template strand that has not yet been read; 
for the trailing strand population, the downstream se- 
quence is known for the gap between the 3' termini of the 
trailing and leading strands. 
[0090] The gap between the leading and trailing primer strands 
may be 1, 2 or 3 bases (where a single base repeat of any 
length, e.g. AAAA, is counted as a single base because the 
entire repeat will be traversed in a single reaction cycle if 
the correct dNTP is introduced), but can never exceed 3 
bases nor shrink spontaneously to zero if the reaction cy- 
cle of the four dNTP's is unchanged and no other reaction 
errors occur, for example a second extension failure on 
the same primer strand. If the reaction cycle of the four 
dNTP's is unchanged, it may readily be understood that a 
primer strand which has failed to extend when the correct 
dNTP, for example dATP, is in the reaction chamber can- 
not trail the leading (majority) strands (which did extend) 
by more than 3 bases, because the fourth base in the 
dNTP reaction cycle will always once again be the correct 
base (dATP) for the strand which failed to extend previously.
Similarly, a trailing strand resulting from an extension
failure can never re-synchronize with the leading
strands if extension subsequently proceeds correctly, be- 
cause the leading strands will always have extended by at 
least one more nucleotide - G, T, or C in the example dis- 
cussion of an A extension failure - before the trailing 
strand can add the missing A. The effect is that after each 
complete dNTP cycle the trailing strands always follow the 
leading strands by an extension amount that represents 
the bases added in one complete dNTP cycle at a given 
point in the sequence. A further consequence is that all 
trailing strands that have undergone a single failure are in 
phase with each other regardless of the point at which the 
extension failure occurred. 
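
Two of the statements above can be checked with a small simulation: after a single failure the trailing strands sit exactly where the leading strands were one complete dNTP cycle earlier, and they never catch up while the cycle is unchanged. The sketch assumes an idealized, error-free polymerase apart from the one failure, and a fixed cycle order. The sequence and all names are illustrative assumptions.

```python
FLOW_ORDER = "ACGT"  # assumed fixed dNTP cycle


def run_length(seq: str, pos: int, base: str) -> int:
    """Length of the repeat of `base` starting at `pos` (0 if the base is wrong)."""
    n = 0
    while pos + n < len(seq) and seq[pos + n] == base:
        n += 1
    return n


def positions(seq: str, n_flows: int, fail_at=None) -> list:
    """3' terminus position after each flow; the extension at `fail_at` is skipped."""
    pos, out = 0, []
    for i in range(n_flows):
        if i != fail_at:
            pos += run_length(seq, pos, FLOW_ORDER[i % 4])
        out.append(pos)
    return out


def check_phase_lag(seq: str, n_flows: int, fail_at: int) -> bool:
    """Verify the one-cycle lag and the absence of spontaneous resynchronization."""
    lead = positions(seq, n_flows)
    trail = positions(seq, n_flows, fail_at)
    for i in range(fail_at + 4, n_flows):
        # Trailing strands repeat the leading strands' history, 4 flows late.
        assert trail[i] == lead[i - 4]
        # They stay behind as long as the leading strands had template left.
        if lead[i - 4] < len(seq):
            assert trail[i] < lead[i]
    return True


if __name__ == "__main__":
    seq = "AACGTTAGCATGGCATTACG"  # bases to be added to the primer (illustrative)
    lead = positions(seq, 16)
    # Try a failure at every flow on which the leading strands actually extend.
    extending = [i for i in range(12) if lead[i] > (lead[i - 1] if i else 0)]
    print(all(check_phase_lag(seq, 16, i) for i in extending))
```

Because every failure produces the same one-cycle lag, the check passes regardless of which extending flow is chosen, consistent with the statement that all single-failure strands are in phase with each other.
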
[0091] The methods described herein may be utilized to signifi- 
cantly extend the read length that can be achieved by the 
technique of reactive sequencing by providing a high level 
of immunity to erroneous signals arising from extension 
failure. In a preferred embodiment of the invention, the 
discrimination method of the invention is computer 
based. 

[0092] First, determination of the readout signals allows real-time
discrimination between the signals due to correct
extension of the leading strand population and error signals
arising from extension of the population of trailing
strands resulting from extension failure. Using this infor- 
mation, accurate sequence readout can be obtained sig- 
nificantly beyond the point at which the trailing strand 
signals would begin to mask the correct leading strand 
signals. In fact, because the trailing strand signals can al- 
ways be distinguished from the leading strand signals, it 
is possible to allow the trailing strand population to con- 
tinue to grow, at the expense of the leading strands, to 
the point where the sequence is read from the signals 
generated on the trailing strand population, and the lead- 
ing strand signals are treated as error signals to be cor- 
rected for. Ultimately, as the probability that a primer 
strand will have undergone at least one extension failure 
approaches unity, the signals from the leading strand 
population will disappear. Correspondingly the probability 
will increase that a trailing strand will undergo a second 
extension failure; the signals from this second population 
of double failure strands can be monitored and the single 
failure strand signals corrected in just the same way as 
the zero failure strand signals were corrected for signals 
due to single failure strands. 



[0093] Second, because knowledge of the leading strand se- 
quence permits one to know the point to which the trail- 
ing strands have advanced, by simulating the effect of an 
extension failure on that known sequence in a computer, 
and also to know the sequence in the 1, 2 or 3 base gap 
between these strands and the leading strands, then for a 
given template sequence the dNTP probe cycle can be al- 
tered at any point to selectively extend the trailing strands 
while not extending the leading strands, thereby resyn- 
chronizing the populations. Alternatively the gap between 
leading and trailing strands can be simulated in the com- 
puter and the gap can be eliminated by reversing the 
dNTP cycle whenever the gap shrinks to a single base. 
These processes are referred to as "healing." If a large 
number of different sequences are being read in parallel 
with the same dNTP reagents, an altered dNTP probe cycle 
that is correct for healing extension failure strands on a 
given sequence may not be correct for healing other se- 
quences. However, with a large enough number of parallel 
sequence readouts, roughly one-third of the sequences 
will have trailing strands with a 1-base gap at any point, 
and so reversal of the dNTP probe cycle at arbitrary intervals
will heal roughly one-third of the readouts with extension
failure gaps. Repeated arbitrary reversal of the
dNTP probe cycle eventually heals roughly two-thirds of 
all the readouts. The overall effect of these error correction
and error elimination processes is to reduce, or eliminate,
any limitation on read length arising from extension
failure. 

[0094] The ability to overcome the read length limitations im- 
posed by extension failure provides significant additional 
flexibility in experimental design. For example, it may be 
that read length is not limited by extension failure, but 
rather by misincorporation of incorrect nucleotides, which 
shuts down extension on the affected strands and steadily 
reduces the signal, ultimately to the point where it is not 
detectable with the desired accuracy. In this case, the 
ability to eliminate the effects of extension failure allows 
the experimenter great flexibility to alter the reaction 
conditions in such a way that misincorporation is mini- 
mized, at the expense of an increased incidence of exten- 
sion failure. Misincorporation frequency depends in part 
on the concentration of the probing dNTP's and the reac- 
tion time allowed. Longer reaction times, or higher dNTP 
concentrations result in an increased probability of misincorporation,
but a reduced incidence of extension failure.
Therefore, if a higher level of extension failure can be tolerated
due to, for example, the computer-aided signal
discrimination and dNTP cycle-reversal healing methods, 
then reaction times and/or dNTP reagent concentrations 
can be reduced to minimize misincorporation, with the re- 
sulting increase in extension failure being countered by 
the computer-aided signal discrimination and/or dNTP 
cycle-reversal healing techniques described above. 
[0095] If the deoxyribonucleotides used for the polymerase reaction
are impure, a small fraction of strands will extend
when the main nucleotide is incorrect and produce a pop- 
ulation of leading, rather than trailing, error strands. As 
with the trailing strands, the leading strand population is 
never more than three bases, nor less than one base, 
ahead of the main population, unless a second error oc- 
curs on the same strand, and also, regardless of where an 
incorrect extension by an impurity dNTP occurs, the lead- 
ing strands are all in phase with each other. A given base 
site can be probed either 1, 2 or 3 times with an incorrect 
dNTP before it must be extended by the correct dNTP, so 
on the average twice. If each of the incorrect dNTP's is as- 
sumed to carry the same percentage of dNTP impurity, 
then the probability of incorrect extension by, e.g., 99%
pure dNTP containing the correct complementary base as
an impurity is 1% ÷ 3 (only 1/3 of the impurity will be the
correct complementary base) × 2 (average 2 incorrect trials
between each correct extension), that is, 0.67%.
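
The arithmetic in this paragraph can be reproduced with a short sketch. It assumes, as the text does, that the impurity is split evenly among the three other bases and that 1, 2 or 3 incorrect trials are equally likely. The function and parameter names are illustrative.

```python
from statistics import mean


def impurity_extension_probability(impurity_fraction: float) -> float:
    """Probability that an impurity dNTP extends a site before the correct dNTP arrives."""
    share_that_is_correct_base = 1 / 3        # impurity split over 3 other bases
    mean_incorrect_trials = mean([1, 2, 3])   # 1, 2 or 3 trials, taken as equally likely
    return impurity_fraction * share_that_is_correct_base * mean_incorrect_trials


if __name__ == "__main__":
    p = impurity_extension_probability(0.01)  # 99% pure dNTP
    print(f"{p:.2%}")
```
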
[0096] As with trailing strands, the leading strand population can 
produce out-of-phase extension signals that complicate 
the readout of the majority strand sequence, as shown in 
Figure 15. Because the sequence downstream of the 3' 
terminus of the majority strands is not known at the time 
of extension of those strands, the signal due to leading 
strand extension can not immediately be corrected for, 
nor can an altered dNTP cycle be calculated which would 
automatically heal the gap between majority and leading 
strands for a given template sequence. However similar 
methods can be used to ameliorate the effects of a lead- 
ing strand population. First, as with trailing strands, re- 
versal of the dNTP probe cycle automatically heals the gap 
between leading and majority strand populations when- 
ever the gap shrinks to a single base. Therefore, arbitrary 
reversal of the dNTP probe cycle has a 1/3 probability of 
healing the gap for a given sequence, or will heal 1/3 of 
the sequences in a large population of sequences probed 
in parallel. Continued arbitrary reversal eventually heals
roughly two-thirds of such gaps. Second, although the
sequence downstream of the 3' terminus of the majority 
strands is not immediately known, information about this 
sequence becomes available as soon as the majority 
strands traverse the gap region. Therefore, for each ex- 
tension of the majority strands it is possible, ideally using 
a computer simulation, to calculate when the leading 
strand population would have traversed that base and 
thus the signal by which a prior extension of the majority 
strands would have been contaminated. In this way the 
majority strand extension signals can retrospectively be 
corrected for leading strand signals. 
[0097] There are important aspects to leading strand creation
that reduce the frequency of occurrence of leading strand
events. First, if the concentration of impurity dNTP's is 
sufficiently low, a leading strand population cannot be 
created by impurity extension of the first base of a repeat. 
This is because the probability of incorrect incorporation 
of two impurity bases on the same strand in the same re- 
action cycle is the square of the probability for a single 
incorporation, and therefore vanishingly small for small 
impurity levels. Therefore, whenever the correct dNTP for
extension of the repeat length is supplied, all strands will
be extended to completion, regardless of whether some
fraction of the strands was already partially extended by
one base of the repeat. Second, not all incorrect
extensions result in a
permanent phase difference. For a permanent phase dif- 
ference to result, a second extension (by a correct base) 
must occur on the leading strand before the main strands 
extend to catch up to the leading strand. Labeling the 
next four sites along the template sequence: 1, 2, 3, 4, 
then, by definition, if a leading strand is created by incor- 
poration of an impurity base on site 1 while the majority 
of the strands do not extend, the main nucleotide sup- 
plied is incorrect for extension at site 1. If the main nu- 
cleotide supplied is correct for extension at site 2, a 
2-base lead is created. There is 1 chance in 4 that the re- 
action chamber contains the correct nucleotide for site 2, 
so the probability of creating a 2-base extension in a sin- 
gle step (with an impurity extension followed by a correct 
extension) is 1/4 the probability of the impurity extension 
alone. For the 0.67% impurity extension probability cited 
above, this means a 0.16% probability of creating a 
2-base extension in a single cycle. 
[0098] However, if the main nucleotide supplied is incorrect for
further extension at site 2, and, by definition, incorrect for
extension at site 1, then for the lead to become fixed, the 
correct nucleotide for site 2 must be supplied before the 
correct nucleotide to extend at site 1. The probability that 
site 2 will extend before site 1 is therefore 50%; for a 
0.67% impurity extension probability, the probability that 
this creates a fixed lead due to a second extension by a 
correct nucleotide is 0.33%. Overall, a 1% impurity level 
results in ~ 0.5% probability of creating a leading strand in 
any given reaction trial. 
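
The figures quoted in the two preceding paragraphs can be tallied as below. The sketch follows the arithmetic as stated in the text and adds the two cases to obtain the overall figure, which is an assumption about how the text combines them. Variable names are illustrative.

```python
# Probability of an impurity extension at site 1 (1% impurity, from the text).
p_impurity_extension = 0.01 / 3 * 2

# Case 1: the main nucleotide in the chamber is already correct for site 2
# (1 chance in 4), so a 2-base lead is created in a single step.
p_lead_single_step = p_impurity_extension * (1 / 4)

# Case 2: the correct nucleotide for site 2 is supplied before the correct
# nucleotide for site 1 (taken as 50% in the text), fixing the lead later.
p_lead_fixed_later = p_impurity_extension * (1 / 2)

# Overall figure, assuming the two contributions are simply added.
p_leading_strand = p_lead_single_step + p_lead_fixed_later

if __name__ == "__main__":
    print(f"single step: {p_lead_single_step:.2%}")
    print(f"fixed later: {p_lead_fixed_later:.2%}")
    print(f"overall:     {p_leading_strand:.2%}")
```

The single-step value evaluates to about 0.17% when rounded; the 0.16% quoted in the text appears to be the same quantity truncated.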

[0099] Preparation of specific embodiments in accordance with 
the present invention will now be described in further de- 
tail. These examples are intended to be illustrative and 
the invention is not limited to the specific materials and 
methods set forth in these embodiments. 

[0100] Example 1 

[0101] A microcalorimetric experiment was performed which
demonstrates for the first time the successful thermal detection
of a DNA polymerase reaction. The results are
shown in Figure 3. Approximately 20 units of T7 Sequenase
were injected into a 3 mL reaction volume containing
approximately 20 nmol of DNA template and complementary
primer, and an excess of dNTPs. The primer was extended
by 52 base pairs, the expected length given the
size of the template. Using a commercial microcalorimeter
(TAM Model 2273; Thermometrics, Sweden) a reaction enthalpy
of 3.5-4 kcal per mole of base was measured
(Figure 3). This measurement is well within the value required
for thermal detection of DNA polymerase activity.
This measurement also demonstrates the sensitivity of 
thermopile detection as the maximum temperature rise in
the reaction cell was 1 × 10⁻³ °C. The lower trace seen in
Figure 3 is from a reference cell showing the injection ar- 
tifact for an enzyme-free injection into buffer containing 
no template system. 
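
As a consistency check on the figures in this example, the expected temperature rise can be estimated from the quantities given. The sketch assumes that every template is fully extended, that no heat is lost from the cell, and that the reaction mixture has the density and heat capacity of water. These assumptions and the function name are not from the text.

```python
WATER_DENSITY_G_PER_ML = 1.0          # assumed for the reaction buffer
WATER_HEAT_CAPACITY_CAL_PER_G_C = 1.0  # assumed for the reaction buffer


def temperature_rise_c(template_mol: float, bases_added: int,
                       enthalpy_kcal_per_mol: float, volume_ml: float) -> float:
    """Adiabatic temperature rise (°C) for complete extension of all templates."""
    heat_cal = template_mol * bases_added * enthalpy_kcal_per_mol * 1000.0
    mass_g = volume_ml * WATER_DENSITY_G_PER_ML
    return heat_cal / (mass_g * WATER_HEAT_CAPACITY_CAL_PER_G_C)


if __name__ == "__main__":
    for enthalpy in (3.5, 4.0):  # kcal per mole of base, range from the text
        dt = temperature_rise_c(20e-9, 52, enthalpy, 3.0)
        print(f"{enthalpy} kcal/mol -> {dt:.2e} °C")
```

Under these assumptions the estimate comes out on the order of 10⁻³ °C, in line with the maximum temperature rise reported above.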
[0102] Example 2 

[0103] To illustrate the utility of mutant T4 polymerases, two
primer extension assays were performed with two different
mutant T4 polymerases, both of which are exonuclease
deficient. In one mutant, Asp112 is replaced with Ala
and Glu114 is replaced with Ala (D112A/E114A). The exonuclease
activity of this mutant on double-stranded DNA
is reduced by a factor of about 300 relative to the wild
type enzyme as described by Reha-Krantz and Nonay
(1993, J. Biol. Chem. 268:27100-27108). In a second
polymerase mutant, in addition to the D112A/E114A
amino acid substitutions, a third substitution replaces
Ile417 with Val (I417V/D112A/E114A). The I417V mutation
increases the accuracy of synthesis by this polymerase
(Stocki, S.A. and Reha-Krantz, L.J., 1995, J. Mol.
Biol. 245:15-28; Reha-Krantz, L.J. and Nonay, R.L., 1994,
J. Biol. Chem. 269:5635-5643).
[0104] Two separate primer extension reactions were carried out
using each of the polymerase mutants. In the first, only a
single correct nucleotide, dGTP, corresponding to a template
C was added. The next unpaired template site is a G
so that misincorporation would result in formation of a G-G
mispair. A G-G mispair tends to be among the most difficult
mispairs for polymerases to make. In the second
primer extension reaction, two nucleotides, dGTP and
dCTP, complementary to the first three unpaired template
sites were added. Following correct incorporation of dGMP
and dCMP, the next available template site is a T. Formation
of C-T mispairs tends to be very difficult while G-T
mispairs tend to be the most frequent mispairs made by
polymerases.

[0105] Time courses for primer extension reactions by both mutant
T4 polymerases are shown in Figure 4. Low concentrations
of T4 polymerase relative to primer/template (p/t)
were used so that incorporation reactions could be measured
on convenient time scales (60 min). By 64 minutes
98% of the primers were extended. In reactions containing 
only dGTP, both polymerases nearly completely extended 
primer ends by dGMP without any detectable incorporation
of dGMP opposite G. In reactions containing both
dGTP and dCTP, both polymerases nearly completely extended
primer ends by addition of one dGMP and two
dCMP's. A small percentage («1%) of misincorporation was 
detectable in the reaction catalyzed by the 
D112A/E114A mutant. Significantly, no detectable misincorporation
was seen in the reaction catalyzed by the
I417V/D112A/E114A mutant. 
[0106] Example 3 

[0107] In accordance with the invention a fluorescent tag may be
attached to the nucleotide base at a site other than the 3' 
position of the sugar moiety. Chemistries for such tags 
which do not interfere with the activity of the DNA poly- 
merase have been developed as described by Goodwin et 
al. (1995, Experimental Technique of Physics 
41:279-294). Generally the tag is attached to the base by 
a linker arm of sufficient length to move the bulky tag out 
of the active site of the enzyme during incorporation. 



[0108] As illustrated in Figure 5, a nucleotide can be connected 
to a fluorophore by a photocleavable linker, e.g., a ben- 
zoin ester. After the tagged dNMP is incorporated onto 
the 3' end of the DNA primer strand, the DNA template 
system is illuminated by light at a wave length corre- 
sponding to the absorption maximum of the fluorophore 
and the presence of the fluorophore is signaled by detec- 
tion of fluorescence at the emission maximum of the fluo- 
rophore. Following detection of the fluorophore, the linker 
may be photocleaved to produce compound 2; the result 
is an elongated DNA molecule with a modified but non-fluorescent
nucleotide attached. Many fluorophores, including,
for example, a dansyl group or acridine, may
be employed in the methodology illustrated by Figure 5.

[0109] Alternatively, the DNA template system is not illuminated 
to stimulate fluorescence. Instead, the photocleavage re- 
action is carried out to produce compound 2 releasing the 
fluorophore, which is removed from the template system 
into a separate detection chamber. There the presence of 
the fluorophore is detected as before, by illumination at 
the absorption maximum of the fluorophore and detection 
of emission near the emission maximum of the fluo- 
rophore. 



[0110] Example 4

[0111] In a specific embodiment of the invention, a linked system
consisting of a chemiluminescently tagged dNTP can con- 
sist of a chemiluminescent group (the dioxetane portion 
of compound 4), a chemically cleavable linker (the silyl 
ether), and an optional photocleavable group (the benzoin 
ester) as depicted in Figure 6. The cleavage of the silyl 
ether by a fluoride ion produces detectable chemiluminescence
as described in Schaap, et al. (1991, "Chemical
and Enzymatic Triggering of 1,2-dioxetanes: Structural
Effects on Chemiluminescence Efficiency" in Bioluminescence
& Chemiluminescence, Stanley, P.E. and Kricka, L.J.
(Eds), Wiley, N.Y. 1991, pp. 103-106). In addition, the
benzoin ester that links the nucleoside triphosphate to the 
silyl linker is photocleavable as set forth in Rock and Chan 
(1996, J. Org. Chem. 61: 1526-1529); and Felder, et al. 
(1997, First International Electronic Conference on Syn- 
thetic Organic Chemistry, Sept. 1-30). Having both a 
chemiluminescent tag and a photocleavable linker is not 
always necessary; the silyl ether can be attached directly 
to the nucleotide base and the chemiluminescent tag is 
destroyed as it is read. 
[0112] As illustrated in Figure 6 with respect to compound 3,
treatment with fluoride ion liberates the phenolate ion of
the adamantyl dioxetane, which is known to chemiluminesce
with high efficiency (Bronstein et al., 1991, "Novel
Chemiluminescent Adamantyl 1,2-dioxetane Enzyme
Substrates," in Bioluminescence & Chemiluminescence,
Stanley, P.E. and Kricka, L.J. (eds), Wiley, N.Y. 1991, pp.
73-82). The other product of the reaction is compound 4,
which is no longer chemiluminescent. Compound 4 upon 
photolysis at 308-366 nm liberates compound 2. 
[0113] The synthesis of compound 1 is achieved by attachment 
of the fluorophore to the carboxyl group of the benzoin, 
whose α-keto hydroxyl group is protected by 9(FMOC),
followed by removal of the FMOC protecting group and 
coupling to the nucleotide bearing an activated carbonic 
acid derivative at its 3' end. Compound 4 is prepared via 
coupling of the vinyl ether form of the adamantyl phenol, 
to chloro(3-cyanopropyl)dimethylsilane, reduction of the 
cyano group to the amine, generation of the oxetane, and 
coupling of this chemiluminescence precursor to the nu- 
cleotide bearing an activated carbonic acid derivative at its 
3' end. 

[0114] The chemiluminescent tag can also be attached to the
dNTP by a cleavable linkage and cleaved prior to detection
of chemiluminescence. As shown in Figure 7, the benzoin
ester linkage in compound 3 may be cleaved photolytically 
to produce the free chemiluminescent compound 5. Reac- 
tion of compound 5 with fluoride ion to generate chemilu- 
minescence may then be carried out after compound 5 
has been flushed away from the DNA template primer in 
the reaction chamber. As an alternative to photolytic 
cleavage, the tag may be attached by a chemically cleav- 
able linker which is cleaved by chemical processing which 
does not trigger the chemiluminescent reaction. 
[0115] Example 5

[0116] In this example, the nucleotide sequence of a template
molecule comprising a portion of DNA of unknown sequence
is determined. The DNA of unknown sequence is
cloned into a single stranded vector such as M13. A
primer that is complementary to a single stranded region
of the vector immediately upstream of the foreign DNA is
annealed to the vector and used to prime synthesis in reactive
sequencing. For the annealing reaction, equal molar
ratios of primer and template (calculated based on the approximation
that one base contributes 330 g/mol to the
molecular weight of a DNA polymer) are mixed in a buffer
consisting of 67 mM Tris-HCl pH 8.8, 16.7 mM (NH4)2SO4,
and 0.5 mM EDTA. This buffer is suitable both for annealing
DNA and subsequent polymerase extension reactions.
Annealing is accomplished by heating the DNA sample in
buffer to 80°C and allowing it to slowly cool to room temperature.
Samples are briefly spun in a microcentrifuge to
remove condensation from the lid and walls of the tube.
To the DNA is added 0.2 mol equivalents of T4 polymerase
mutant I417V/D112A/E114A and buffer components
so that the final reaction cell contains 67 mM Tris-HCl
pH 8.8, 16.7 mM (NH4)2SO4, 6.7 mM MgCl2 and 0.5
mM dithiothreitol. The polymerase is then queried with
one dNTP at a time at a final concentration of 10 µM. The
nucleotide is incubated with polymerase at 37°C for 10 s.
Incorporation of dNTPs may be detected by one of the 
methods described above including measuring fluores- 
cence, chemiluminescence or temperature change. The 
reaction cycle will be repeated with each of the four dNTPs 
until the complete sequence of the DNA molecule has 
been determined. 
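
The 330 g/mol-per-base approximation used for the annealing reaction converts between moles and mass as sketched below. The primer and template lengths and the 1 pmol amount are hypothetical values chosen only for illustration, as are the function and constant names.

```python
GRAMS_PER_MOLE_PER_BASE = 330.0  # approximation stated in the text


def mass_ng(n_bases: int, picomoles: float) -> float:
    """Mass in nanograms of `picomoles` of a single-stranded DNA of `n_bases`."""
    molar_mass = n_bases * GRAMS_PER_MOLE_PER_BASE   # g/mol
    grams = molar_mass * picomoles * 1e-12
    return grams * 1e9


if __name__ == "__main__":
    primer_len = 20       # hypothetical primer length
    template_len = 7250   # hypothetical single-stranded vector plus insert
    pmol = 1.0            # equal molar amounts of primer and template
    print(f"primer:   {mass_ng(primer_len, pmol):.1f} ng")
    print(f"template: {mass_ng(template_len, pmol):.1f} ng")
```

Equal molar amounts therefore correspond to very different masses, which is why the ratio is calculated from length rather than weighed out directly.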
[0117] Example 6

[0118] Figure 7 illustrates a mechanical fluorescent sequencing 
method in accordance with the invention. A DNA template 
and primer are captured onto beads 18 using, for example,
avidin-biotin or -NH2/N-hydroxysuccinimide chemistry
and loaded behind a porous frit or filter 20 at the tip
of a micropipette 22 or other aspiration device as shown 
in Figure 7(a), step 1. Exonuclease deficient polymerase 
enzyme is added and the pipette tip is lowered into a 
small reservoir 24 containing a solution of fluorescently-la- 
beled dNTP. As illustrated in step 2 of Figure 7(a), a small 
quantity of dNTP solution is aspirated through the filter 
and allowed to react with the immobilized DNA. The dNTP 
solution also contains approximately 100 nM polymerase 
enzyme, sufficient to replenish rinsing losses. After reac- 
tion, as shown in step 3, the excess dNTP solution 24 is 
forced back out through the frit 20 into the dNTP reser- 
voir 24. In step 4 of the process the pipette is moved to a 
reservoir containing buffer solution and several aliquots of 
buffer solution are aspirated through the frit to rinse ex- 
cess unbound dNTP from the beads. The buffer inside the 
pipette is then forced out and discarded to waste 26. The 
pipette is moved to a second buffer reservoir (buffer 2), 
containing the chemicals required to cleave the fluores- 
cent tag from the incorporated dNMP. The reaction is al- 
lowed to occur to cleave the tag. As shown in step 5 the 
bead/buffer slurry with the detached fluorescent tag in
solution is irradiated by a laser or light source 28 at a
wavelength chosen to excite the fluorescent tag; the fluorescence
is detected by fluorescence detector 30 and
quantified if incorporation has occurred.
[0119] Subsequent steps depend on the enzyme strategy used. If
a single-stage strategy with an exonuclease-deficient
polymerase is used, as illustrated in Figure 7(b), the solution
containing the detached fluorescent tag is expelled
and discarded to waste (step 6), followed by a further
rinse step with buffer 1 (step 7), which is thereafter discarded
(step 8). The pipette is then moved to a second
reservoir containing a different dNTP (step 9) and the process
repeats starting from step 3, cycling through all four
dNTPs.

[0120] In a two-stage strategy, after the correct dNTP has been
identified and the repeat length quantified in step 5, the
reaction mixture is rinsed as shown in steps 6, 7, and 8 of
Figure 7(b) and the pipette is returned to a different
reservoir containing the same dNTP (e.g., dNTP1) as shown
in step (a) of Figure 8 to which a quantity of exonuclease-proficient
polymerase has been added and the solution is
aspirated for a further stage of reaction which proof-reads
the prior extension and correctly completes the extension.
This second batch of dNTP need not be fluorescently
tagged, as the identity of the dNTP is known and no
sequence information will be gained in this proof-reading 
step. If a tagged dNTP is used, the fluorescent tag is 
preferably cleaved and discarded as in step 5 of Figure 
7(a) using Buffer 2. Alternatively, the initial incorporation 
reaction shown in step 2 of Figure 7(a) is carried out for 
long enough, and the initial polymerase is accurate 
enough, so that the additional amount of fluorescent tag 
incorporated with dNTP1 at step a of Figure 8 is small and
does not interfere with quantification of the subsequent 
dNTP. Following proof-reading in step a of Figure 8, ex- 
cess dNTP is expelled (step b) and the reaction mixture is 
rinsed (steps c, d) with a high-salt buffer to dissociate the 
exo+ polymerase from the DNA primer/template. It is im- 
portant not to have exonuclease-proficient enzyme 
present if the DNA primer/template is exposed to an in- 
correct dNTP. The pipette is then moved to step e, in 
which the reservoir contains a different dNTP, and the 
process is repeated, again cycling through all four dNTPs. 
[0121] Example 7 

[0122] A new process for destruction of a fluorophore signal
which involves reaction of the electronically excited fluorophore
with an electron-abstracting species, such as
diphenyliodonium salts, is described.
[0123] The reaction of a diphenyliodonium ion with an electronically
excited fluorescein molecule is illustrated in Figure
10. The diphenyliodonium ion extracts an electron from
the excited state of the fluorescein molecule producing a 
radical ion of the fluorescein molecule and a neutral 
diphenyliodonium free radical. The diphenyliodonium free 
radical rapidly decomposes to iodobenzene and a phenyl 
radical. The fluorescein radical ion then either reacts with 
the phenyl radical or undergoes an internal rearrangement
to produce a final product which is no longer fluorescent. 

[0124] Figures 11 and 12 demonstrate evidence for the specific
destruction of fluorescein by diphenyliodonium ion. In Figure
11, fluorescence spectra are presented for a mixture of
fluorescein and tetramethylrhodamine dyes, before and 
after addition of a solution of diphenyliodonium chloride. 
It is seen that the fluorescence from the fluorescein dye is 
immediately quenched, demonstrating electron abstrac- 
tion from the excited state of the molecule while the fluo- 
rescence from the rhodamine is unaffected, apart from a 
small decrease due to the dilution of the dye solution by 
the added diphenyliodonium chloride solution. 



[0125] Elimination of the fluorescent signal from the fluorescein
dye by diphenyliodonium chloride is not in itself proof 
that the fluorescein molecule has been destroyed, because 
electron abstraction from the excited state of fluorescein 
effectively quenches the fluorescence, and quenching 
need not result in destruction of the fluorescein molecule. 
However, Figure 12 demonstrates that the fluorescein
molecule is destroyed by reaction with the diphenyliodonium
and not simply quenched. Figure 12 shows
the ultraviolet (UV) absorption spectra for a fluorescein
solution before and after addition of a solution of 
diphenyliodonium chloride. Spectrum 1 is the UV absorp- 
tion spectrum of a pure fluorescein solution. Spectrum 2 
is the UV absorption of the fluorescein solution following 
the addition of a solution containing a molar excess of 
diphenyliodonium (DPI) chloride and exposure to a single 
flash from a xenon camera strobe. The data show that flu- 
orescein is essentially destroyed by the photochemical re- 
action with the DPI ion. Figure 12 provides clear evidence 
that diphenyliodonium chloride not only quenches the flu- 
orescence from the fluorescein dye but destroys the 
molecule to such an extent that it can no longer act as a 
fluorophore. 



[0126] An experiment was performed to demonstrate efficient 
fluorescent detection and destruction of fluorophore using
a template sequence. The template, synthesized with an
alkyl amino linker at the 5' terminus, was:

3'-H2N-(CH2)7-GAC CAT TAT AGG TCT TGT TAG GGA AAG GAA GA-5'



[0127] The trial sequence to be determined is: G GGA AAG GAA 
GA. 

[0128] A tetramethylrhodamine-labeled primer sequence was
synthesized to be complementary to the template as follows:

5'-[Rhodamine]-(CH2)6-CTG GTA ATA TCC AGA ACA AT-3'



[0129] The alkyl amino-terminated template molecules were 

chemically linked to Sepharose beads derivatized with N- 
hydroxysuccinimide and the rhodamine-labeled primer 



was annealed to the template. The beads with attached 
DNA template and annealed primer were loaded behind a 
B-100 disposable filter in a 5-ml syringe. A volume con- 
taining a mixture of fluorescein-labeled and unlabelled 
dCTP in a ratio of 1:2 and exonuclease-deficient poly- 
merase enzyme in a reaction buffer as specified by the 
manufacturer was drawn into the syringe. Reaction was 
allowed to proceed for 20 minutes, at 35°C. After the re- 
action, the fluid was forced out of the syringe, retaining 
the beads with the reacted DNA behind the filter, and 
three washes with double-distilled water were performed 
by drawing water through the filter into the syringe and 
expelling it. The beads were resuspended in phosphate 
buffer, the filter was removed and the suspension was 
dispensed into a cuvette for fluorescence analysis. Follow- 
ing fluorescence analysis, the bead suspension was loaded 
back into the syringe which was then fitted with a filter 
tip, and the phosphate buffer was dispensed. A solution 
of DPI was drawn up into the syringe with a concentration 
calculated to be in 1:1 molar equivalence to the theoreti- 
cal amount of DNA template, the filter was removed and 
the bead suspension was dispensed into a cuvette for UV 
light exposure for 15 minutes. The suspension was recollected
into a syringe, the filter was reattached, the DPI solution
was expelled, and the beads were resuspended by
drawing up 0.7 mL of phosphate buffer. After removal of
the filter the bead suspension was dispensed into a clean 
cuvette for fluorescence analysis to check the complete- 
ness of destruction of the fluorescein by the reaction with 
the DPI. A subsequent polymerase reaction was performed 
using the same protocol with labeled dTTP and similarly 
measured for fluorescence. 
[0130] Figure 13 demonstrates the results of the polymerase re- 
actions, with photochemical destruction of the fluorescein 
label by DPI following each nucleotide incorporation reac- 
tion. Curve 1 shows rhodamine fluorescence following an- 
nealing of the rhodamine labeled primer to the beads, 
demonstrating covalent attachment of the template 
strands to the beads and capture of the rhodamine-la- 
beled primer strands. Curve 2 demonstrates detection of 
fluorescein following polymerase-catalyzed incorporation 
of three partially fluorescein-labeled dCMPs onto the 3' 
terminus of the primer strands. Curve 3 shows complete 
destruction of the incorporated fluorescein label by 
photo-induced reaction with diphenyliodonium chloride. 
Loss of rhodamine signal here is attributed to loss of a 



significant fraction of the beads which stuck to the filter 
during washes. Curve 4 shows detection of a new fluores- 
cein label following photochemical destruction of the flu- 
orescein attached to the dCMP's and subsequent poly- 
merase-catalyzed incorporation of three partially fluores- 
cein-labeled dTMPs onto the 3' terminus of the primer 
strands. 

[0131] The following methods were utilized to demonstrate suc- 
cessful destruction of a fluorescein-labeled dTMP. 

[0132] Sepharose beads were purchased from Amersham with 

surfaces derivatized with N-hydroxysuccinimide for reac- 
tion with primary amine groups. The alkyl amino-ter- 
minated templates were chemically linked to the 
Sepharose beads using the standard procedure recom- 
mended by the manufacturer. 

[0133] The beads with attached template were suspended in 250
mM Tris buffer containing 250 mM NaCl and 40 mM MgCl2.
The solution containing the primer strands was added
and the mixture heated to 80°C and cooled over ~2 hours
to anneal the primers to the surface-immobilized DNA 
template strands. 

[0134] Fluorescein-labeled dUTP and dCTP were purchased from 
NEN Life Science Products. Unlabeled dTTP and dCTP were 



purchased from Amersham. 

[0135] Prior to any reaction, the annealed primer/template was
subjected to fluorescence analysis to ensure that anneal- 
ing had occurred. The excitation wavelength used was 
320 nm and fluorescence from fluorescein and rhodamine 
was detected at ~520 nm and ~580 nm respectively. 

[0136] Reagent volumes were calculated on the assumption that 
the DNA template was attached to the beads with 100% 
efficiency. 

[0137] The 5X reaction buffer contained: 

1) 250 mM Tris buffer, pH 7.5
2) 250 mM NaCl
3) 40 mM MgCl2
4) 1 mg/mL BSA
5) 25 mM dithiothreitol (DTT)

mixed and brought to volume with double-distilled H2O.

[0138] T4 DNA polymerase was obtained from Worthington Bio- 
chemical Corp. The polymerase was dissolved in the poly- 
merase buffer according to the manufacturer's protocols. 

[0139] Fluorescein-labeled and unlabeled dCTP's were mixed in a 
ratio of 1:2. 

[0140] The reaction was run in a 5 mL syringe (Becton Dickinson)
fitted with a B-100 disposable filter (Upchurch Scientific).
This limits the reaction volume to 5 mL total:



Primer template suspension    0.7 mL
T4 DNA Polymerase             1.0 mL
FdCTP/dCTP                    0.040 mL
5X reaction buffer            2.0 mL
double-dist. H2O              1.0 mL



[0141] The reaction was allowed to proceed in a 35°C oven for 20 
minutes. Following reaction, the fluid was forced out of 
the syringe allowing the filter to retain the beads with the 
reacted DNA. Three washes with double-distilled water 
were performed. All waste was collected and saved for fu- 
ture reuse. The beads were resuspended in 0.7 mL of 
phosphate buffer, the filter was removed and the suspen- 
sion was dispensed into a cuvette for fluorescence analy- 
sis. 

[0142] Following fluorescence analysis the bead suspension was 
collected into a 1 mL syringe (Becton Dickinson) which 
was then fitted with a filter tip. The phosphate buffer was 
dispensed and the waste collected. A solution of 
diphenyliodonium chloride (DPI) was drawn up with a con- 
centration calculated to be in 1:1 molar equivalence to the 
theoretical amount of DNA template (i.e. DPI was present 
in excess of the incorporated fluorescein-labeled dCTP). 
The filter was removed and the bead suspension with 



added DPI was dispensed into a cuvette and exposed to 
UV light for 15 minutes. The suspension was recollected 
into a syringe, the filter reattached, the DPI solution was 
dispensed and the beads were resuspended in 0.7 mL of
phosphate buffer. The bead suspension was dispensed 
into a clean cuvette for fluorescence analysis. 

[0143] It should be noted that a significant fraction of the beads
used in this procedure appeared to become stuck in the 
filter on the syringe. This resulted in a significant increase 
in the pressure needed to force fluids through the filter as 
it became clogged by the beads, and more importantly re- 
duced the amount of DNA available for fluorescent detec- 
tion of incorporated nucleotides and reduced the weak 
rhodamine signal from the labeled primer to the point 
where it was no longer detectable. 

[0144] Following the successful incorporation reaction with dCTP, 
a subsequent polymerase reaction was run to incorporate 
dTTP. The incorporated fluorescein-labeled dTMP was de- 
tected, but with significantly lower intensity due to the 
losses of the beads in the filter in the multiple transfer 
steps between the reaction syringe and the analysis cu- 
vette. The lowered signal could also result in part from a 
different labeling efficiency of the dTTP and a different in- 



corporation efficiency for the labeled nucleotide in the 
polymerase reaction. Because the rhodamine signal was 
no longer detectable following the second incorporation 
reaction it was not possible to correct for bead losses. 

[0145] The results are shown in Figure 13. The data represented 
by the curves were obtained sequentially as follows: 

[0146] Curve 1 shows the rhodamine fluorescence following an- 
nealing of the rhodamine-labeled primer to the bead- 
immobilized DNA template. 

[0147] Curve 2 demonstrates detection of the fluorescein-labeled 
dCTP following polymerase-catalyzed incorporation of 
three dCMP's onto the 3' terminus of the primer strands. 

[0148] Curve 3 demonstrates complete destruction of the incor- 
porated fluorescein label on the dCMP's by photo-induced 
reaction with diphenyliodonium chloride. In this instance,
the rhodamine label also has vanished; this is primarily 
because a significant fraction of the beads were lost by 
sticking in the filter used in the reagent flushing opera- 
tion. It is possible that the rhodamine also was destroyed 
by the DPI photochemical reaction. 

[0149] Curve 4 demonstrates detection of a new fluorescein label 
following photochemical destruction of the fluorescein la- 
bel on the dCMP's and polymerase-catalyzed incorpora- 



tion of three fluorescein-tagged dTMP's onto the 3' termi- 
nus of the primer strands. The lower signal compared to 
curve 2 results mainly from the bead losses in the syringe, 
but may also reflect a lower incorporation efficiency of the 
dTMP and/or a lower labeling efficiency. Because the rho- 
damine signal from the labeled primer is no longer de- 
tectable, the bead losses cannot be calibrated. 

[0150] The results shown here demonstrate the concept of reac- 
tive sequencing by fluorescent detection of DNA extension 
followed by photochemical destruction of the fluorophore, 
which allows further extension and detection of a subse- 
quent added fluorophore. This cycle can be repeated a 
large number of times if sample losses are avoided. In 
practical applications of this approach, such losses will be 
avoided by attaching the primer or template strands to the 
fixed surface of an array device, for example a microscope 
slide, and transferring the entire array device between a 
reaction vessel and the fluorescent reader. 

[0151] Example 8

[0152] Read length is defined as the maximum length of DNA
sequence that can be read before uncertainties in the
identities of the DNA bases exceed some defined level. In the
reactive sequencing approach, read length is limited by 



two types of polymerase failures: misincorporation, i.e., 
incorrectly incorporating a noncomplementary base, and 
extension failure, i.e., failure to extend some fraction of
the DNA primer strands on a given cycle in the presence 
of the correct complementary base. Example 2 demon- 
strated that reaction conditions can be optimized such 
that neither type of failure affects more than ~ 1% of the 
arrayed strands for any given incorporation reaction. Nei- 
ther type of failure directly produces an error signal in the 
sequence readout, because neither a 1% positive signal, 
for a misincorporation, nor a 1% decrease in the signal for 
a correct incorporation, in the case of extension failure, 
will be significant compared to the signals anticipated for 
a correct incorporation. However, accumulated failures 
limit the read length in a variety of different ways. 
[0153] For example, misincorporation inhibits any further exten- 
sion on the affected strand resulting in a reduction in 
subsequent signals. It is estimated that the probability of 
continuing to extend a given strand following a misincor- 
poration is no greater than 0.1%, so that any contribution 
to the fluorescent signal resulting from misincorporation 
followed by subsequent extension of the error strand will 
be negligible. Instead, the accumulation of misincorpora- 



tions resulting in inhibition of strand extension ultimately 
reduces the overall signal amplitude for correct base in- 
corporation to a level at which noise signals in the detec- 
tion system begin to have a significant probability of pro- 
ducing a false signal that is read as a true base incorpora- 
tion. 
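
The attenuation described above can be illustrated numerically. The following is a minimal sketch and not part of the disclosed method. It assumes that misincorporation occurs independently with a fixed probability on each incorporation reaction and that an affected strand never extends again (the text puts the chance of further extension at no greater than 0.1%). The function names and the 25% amplitude threshold are illustrative assumptions.

```python
import math


def surviving_fraction(p_misincorporation: float, n_reactions: int) -> float:
    """Fraction of strands still extending after n_reactions incorporations.

    Assumes independent misincorporation events and no recovery of an
    affected strand (both are simplifying assumptions).
    """
    return (1.0 - p_misincorporation) ** n_reactions


def reactions_until(threshold: float, p_misincorporation: float) -> int:
    """Smallest number of reactions after which the surviving fraction
    has fallen to `threshold` or below."""
    return math.ceil(math.log(threshold) / math.log(1.0 - p_misincorporation))


if __name__ == "__main__":
    # 1% misincorporation per reaction, the level cited from Example 2.
    for n in (50, 100, 200):
        print(n, round(surviving_fraction(0.01, n), 3))
    print(reactions_until(0.25, 0.01))
```

Under these assumptions the correct-extension amplitude decays geometrically, so the usable read length depends on how small a signal the detection system can still separate from noise.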

[0154] Extension failures typically arise due to the kinetics of the 
extension reaction and limitations on the amount of time 
allotted for each extension trial with the single deoxynu- 
cleotide triphosphates (dNTP's). When reaction is termi- 
nated by flushing away the dNTP supply, a small fraction 
of the primer strands may remain unextended. These 
strands on subsequent dNTP reaction cycles will continue 
to extend but will be out of phase with the majority 
strands, giving rise to small out-of-phase signals, i.e., 
signaling a positive incorporation for an added dNTP 
which is incorrect for extension of the majority strands. 
Because extension failure can occur, statistically, on any 
extension event, the out-of-phase signals will increase as 
the population of strands with extension failures grows. If 
reaction conditions are chosen so that the reaction is 
99.9% complete on a given reaction cycle, for example, 
after a further number, N, of successful extension reactions,
the out-of-phase signal will be approximately (1 - 0.999^N).
The number N at which the out-of-phase signal
becomes large enough to be incorrectly read as a correct 
extension signal is the read length. For example, after ex- 
tension by 200 bases with 99.9% completion, the out- 
of-phase signal is approximately 18% of the in-phase sig- 
nal, for a single base extension in either case. After ex- 
tension by 400 bases the out-of-phase signal grows to 
33%. The point at which the read must terminate is dic- 
tated by the ability to distinguish the in-phase signals 
from the out-of-phase signals. 
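
The growth of the out-of-phase signal can be checked with a short calculation. This is a minimal sketch of the expression 1 - c^N given in the text, where c is the fractional completion per cycle. The helper `read_length` and its threshold argument are hypothetical additions for illustration; the text does not fix a particular cutoff.

```python
def out_of_phase_fraction(completion: float, n_extensions: int) -> float:
    """Approximate out-of-phase signal after n_extensions, 1 - completion**N."""
    return 1.0 - completion ** n_extensions


def read_length(completion: float, max_out_of_phase: float) -> int:
    """Largest N for which the out-of-phase signal stays at or below
    max_out_of_phase. Requires completion < 1 and max_out_of_phase < 1,
    otherwise the loop would not terminate."""
    n = 0
    while out_of_phase_fraction(completion, n + 1) <= max_out_of_phase:
        n += 1
    return n


if __name__ == "__main__":
    for n in (200, 400):
        print(n, f"{out_of_phase_fraction(0.999, n):.1%}")
```

With 99.9% completion the calculation reproduces the approximate figures of 18% at 200 bases and 33% at 400 bases quoted above.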
[0155] In what follows, a length of single base repeats, e.g.
AAAAA, is treated as a single base for the purposes of 
discussing the phase difference between strands. If the 
reaction cycle of the four dNTP's is unchanged, then a 
primer strand which has failed to extend when the correct 
dNTP, for example dATP, is in the reaction cell cannot trail 
the leading, i.e., majority strands, which did extend cor- 
rectly, by more than 3 bases because the fourth base in 
the dNTP reaction cycle will always once again be the cor- 
rect base (dATP) for the strand which failed to extend pre- 
viously. It is assumed that extension failure is purely sta- 
tistical, and that any strand which fails to extend has an 



equal chance of subsequent extension when the correct 
dNTP is supplied, and that this extension probability is 
sufficiently high that the chance of repeated extension 
failures on the same strand is vanishingly small. For example,
if the probability of extension failure on a single
strand is 0.1%, the probability of two extension failures on
the same strand is (0.001)^2 or 10^-6. Similarly, the trailing
strand can never resynchronize with the leading strands if 
extension subsequently proceeds correctly, because the 
leading strands will always have extended by at least one 
more nucleotide - G, T, or C in the example discussion of 
an A extension failure - before the trailing strand can add 
the missing A. The effect is that after each complete dNTP 
cycle the trailing strands always follow the leading strands 
by an extension amount that represents the bases added 
in one complete dNTP cycle at a given point in the se- 
quence. These observations predict that: (i) the gap be- 
tween the leading and trailing strands perpetually oscil- 
lates between 1 and 3 bases and can never increase un- 
less a second extension failure occurs on the same strand; 
and (ii) the gap between the leading and trailing strands is 
independent of the position along the trailing strand at 
which the extension failure occurs. This gap at any given 



point in the extension of the leading strands is solely a 
function of the sequence of the leading strand population 
up to that point and the dNTP probe cycle. In other words, 
a population of trailing strands is produced due to ran- 
dom extension failure at different points in the sequence, 
but these trailing strands themselves are all exactly in 
phase with each other. 

[0156] Because the result of an extension failure is to produce a 
trailing strand population that trails the leading strands 
perpetually by an amount that oscillates between one and 
three nucleotides, assuming that a second extension fail- 
ure does not occur on the trailing strand and that the 
probing dNTP cycle remains unchanged, therefore the gap 
between the leading and trailing strand populations can 
always be known by tracking the leading strand sequence 
by, for example, computer simulation and simulating an 
extension failure event at any point along the sequence. 
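
The tracking described above can be sketched as a small simulation. This is an illustrative sketch only, under stated assumptions: the template is written in the order in which it is traversed, a single-base repeat is traversed in one probe step, and a strand suffers at most one extension failure. The function names are hypothetical. The sketch prints the lag in bases after each probe rather than asserting a bound on it, and it checks the property that, once a failed strand has resumed extension, it sits exactly where the leading strands were one complete probe cycle earlier, whatever the position of the failure.

```python
from itertools import cycle, islice

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def bases_to_add(template: str) -> str:
    """Bases the polymerase must add, in order, to copy the template."""
    return "".join(COMPLEMENT[b] for b in template)


def extend(to_add: str, pos: int, probe: str) -> int:
    """Primer length after one probe; a single-base repeat is added in one step."""
    while pos < len(to_add) and to_add[pos] == probe:
        pos += 1
    return pos


def positions(to_add, probe_cycle, n_probes, fail_step=None):
    """Primer length after each probe; element 0 is the starting length.

    If fail_step is given, the strand skips extension on that probe
    (1-based), which models a single extension failure.
    """
    out = [0]
    for step, probe in enumerate(islice(cycle(probe_cycle), n_probes), start=1):
        pos = out[-1]
        out.append(pos if step == fail_step else extend(to_add, pos, probe))
    return out


if __name__ == "__main__":
    to_add = bases_to_add("GAAACCAGAAAGTCC")
    lead = positions(to_add, "CAGT", 24)
    for fail_step in (1, 7, 12):
        trail = positions(to_add, "CAGT", 24, fail_step)
        print(fail_step, [l - t for l, t in zip(lead, trail)])
```

Because every failed strand ends up on the same delayed trajectory, a single simulated failure suffices to predict the behavior of the whole trailing population in this simplified model.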

[0157] Thus the present invention provides, first, a general 

method of computer tracking of the sequence information 
which allows the out-of-phase error signals due to exten- 
sion of trailing strands to be recognized and subtracted 
from the correct signals, and, second, methods of altering 
the probing dNTP cycle to selectively extend the trailing 



strands so that they move back into phase with the lead- 
ing strands, thus completely eliminating sequence uncer- 
tainty due to out-of-phase signals arising from the trail- 
ing strands that result from extension failure. 
[0158] The statistics which govern the ability to distinguish an 

incorrect signal from out-of-phase strands from a correct 
signal depend upon the noise level and statistical variation 
of the fluorescence signal. Assuming that the signal for a 
correct 1-base extension has a standard deviation of ±5%, 
then statistically 99.73% of the signals will have an amplitude
between 0.85 and 1.15 (± 3 standard deviations
from the average value) when the average value is 1.0 and 
the standard deviation is 0.05. If the extension signal 
must be at least 85% of the average single extension sig- 
nal to register a correct extension, then statistically a cor- 
rect extension will be missed only 0.13% of the time, i.e. 
the readout accuracy would be 99.87%. Another 0.13% of 
the signals for a correct extension will be greater than 
1.15, but the concern is only with signals that are lower 
than average and so are more difficult to distinguish from 
a growing signal from out-of-phase strands. The statis- 
tics for errors arising from out-of-phase extension of a 
trailing strand are similar. If the standard deviation of the 



trailing strand signals is also ±5% of the mean extension
signal, which will be true whenever the trailing strand
intensity approaches the leading strand intensity, then if the
trailing strand intensity does not grow beyond 0.7, the
fraction of trailing strand extensions that give rise to a
signal of 0.85 or greater (4 standard deviations beyond the
mean) is less than 0.01%. Thus an out-of-phase signal
arising from a single-base extension on one of the three
sets of trailing strands should be distinguishable from the
in-phase signal with accuracy so long as the out-of-phase
signal does not grow beyond ~70% of the in-phase signal.
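
The tail probabilities used in this argument can be reproduced with the Gaussian error function. The following is a minimal sketch assuming normally distributed signal amplitudes. It further assumes, as one reading of the text, that the standard deviation of the trailing strand signal is 5% of its own mean (0.035 for a mean of 0.7), which places the 0.85 threshold a little more than 4 standard deviations above that mean. The function names are illustrative.

```python
import math


def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def prob_below(threshold: float, mean: float, sd: float) -> float:
    """Probability that a Gaussian signal falls below the threshold."""
    return upper_tail((mean - threshold) / sd)


def prob_at_or_above(threshold: float, mean: float, sd: float) -> float:
    """Probability that a Gaussian signal reaches the threshold."""
    return upper_tail((threshold - mean) / sd)


if __name__ == "__main__":
    # Correct single-base extension: mean 1.0, sd 0.05, cutoff 0.85.
    missed = prob_below(0.85, mean=1.0, sd=0.05)
    # Trailing strand signal: mean 0.7, sd assumed to be 5% of that mean.
    false_call = prob_at_or_above(0.85, mean=0.7, sd=0.05 * 0.7)
    print(f"missed correct extension: {missed:.3%}")
    print(f"trailing signal read as correct: {false_call:.4%}")
```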

[0159] The above discussion assumes that all the extension 

events correspond to single base extensions. However, 
multiple single-base repeats are common in DNA se- 
quences, thus one must consider the situation where the 
out-of-phase signal can be M times larger than that for a 
single base extension, where M is the repeat number. For 
example, if the population of one of the three sets of out- 
of-phase strands has grown to 20% of the leading strand 
population, at which level the in-phase and out-of-phase 
signals can readily be distinguished for a single base ex- 
tension, then if this set of out-of-phase strands encoun- 



ters a 5-base repeat, e.g. AAAAA, the signal for that re- 
peat becomes identical in magnitude to that for a single 
base extension on the in-phase strands. Real-time com- 
puter monitoring of the extension signals permits dis- 
crimination against such repeat-enhanced out-of-phase 
signals, for example, by implementing linear and/or non- 
linear auto-regressive moving average (ARMA) schemes. 
The essential points here are as follows: (i) the out-of-phase
strands are those that are trailing the majority
strands as a result of extension failure; misincorporation 
events which could produce leading error strands have the 
effect of shutting down further extension on the affected 
strands and so do not give rise to significant out- 
of-phase error signals; (ii) there is always only one popu- 
lation of trailing strands regardless of where the exten- 
sion failure occurred; all the primer strands in this popu- 
lation have been extended to the same point which trails 
the leading strand sequence by 1, 2 or 3 bases; and (iii) 
because the leading strands have always previously tra- 
versed the sequence subsequently encountered by the 
trailing strands, the sequence at least 1 base beyond the 
3' terminus of the trailing strands is always known and al- 
lows prediction of exactly whether, and by how much, 



these trailing strands will extend for any nucleotide sup- 
plied, by simulating, in a computer for example, the effect 
of an extension failure at any point in the known se- 
quence upstream of the position to which the leading 
strands have advanced. 
[0160] On each incorporation trial, in addition to any possible

correct extension signal for the leading strands, there may 
also be an error signal corresponding to extension of the 
trailing strands. For example, let us assume that the trail- 
ing strand population has grown as large as 20% of the 
leading strand population. The size of this population can 
be monitored by detecting the incorporation signal when 
the trailing strands extend and the leading strands do not. 
Assume that the leading strand population has just tra- 
versed a single base repeat region on the template, for 
example AAAAA, and incorporated onto the primer the 
complementary T repeat: TTTTT. The trailing strands will 
not traverse this same AAAAA repeat for at least a com- 
plete cycle of the four probing nucleotides, until the next 
time the strands are probed with dTTP. Knowing the size 
of the trailing strand population from the amplitude of its 
incorporation signals, determined at any point where the 
leading strands do not extend but the trailing strands do, 



the signal to be expected from the trailing strand popula- 
tion due to the TTTTT incorporation can be calculated 
precisely. If the trailing strand population is 1/5 as large 
as the leading strand population, for example, this signal 
will mimic incorporation of a single T on the leading 
strand population. In the absence of the computer-aided 
monitoring method discussed here, such a false signal 
would give rise to a drastic sequence error. 
[0161] Figures 14A and 14B demonstrate how data would appear 
for a sequence: [CTGA] GAA ACC AGA AAG TCC [T], 
probed with a dNTP cycle: CAGT, close to the primer 
where no extension failure has occurred (Figure 14A) and 
well downstream (Figure 14B) at a point where 60% of the 
strands have undergone extension failure and are produc- 
ing out-of-phase signals, and misincorporation has shut 
down extension on 75% of all strands. The readouts 
shown start at the second G in the sequence (beyond the 
[CTGA] sequence in parentheses) and end at the last C 
(before the [T] in parentheses). The digital nature of the 
signal in Figure 14A and also the amplitude scale should 
be noted. In Figure 14B, the signal for a single base ex- 
tension has been reduced by 60%, from 1.0 to 0.4 due to 
the extension failure strands, and by a further factor of 4 



to 0.1 due to misincorporation and the resulting 75% sig- 
nal loss. However, added to the correct extension signals 
are signals due to the out-of-phase extension of the trail- 
ing strands. At first sight, the readout is completely dif- 
ferent from the correct readout shown in Figure 14A, due 
to the superposition of signals produced when the trailing 
strands encounter the sequence previously traversed by 
the leading strands. Particularly large errors arise when- 
ever the trailing strand population encounters the AAA re- 
peats. For example, the second T probe yields a signal 
amplitude corresponding to an AAAAA repeat instead of 
the correct single A, the third G probe gives a signal cor- 
responding to CCC when in fact there is no C at this point 
in the leading strand sequence, the fourth T probe reads 4 
A's when the correct sequence has none (the trailing 
strands encounter the second AAA repeat). However, be- 
cause the sequence from the leading strands is known, 
the false signals arising from the trailing strands can be 
predicted and subtracted from the total signal to obtain 
the correct sequence readout. This is shown in Figure 
14C, where the signals arising from the trailing strands 
are coded by different shading from the leading strand 
signal. Because the signals due to the trailing strands can 



be predicted, the error signals can be subtracted to obtain 
the correct digital sequence readout shown in Figure 14D. 
It should be noted that the data in Figure 14D are now 
identical to those in Figure 14A, and yield the correct se- 
quence readout for the leading strands, the only differ- 
ence being that the overall intensity is reduced due to the 
assumed loss of signal due to misincorporation and ex- 
tension failure, the latter populating the trailing strands. 
In other words, by keeping track of the sequence in a 
computer the effect is as though one could directly visual- 
ize the different contributions as depicted on the plot in 
Figure 14C. Therefore, it is possible to predict for any 
probe nucleotide event exactly what the signal from the 
trailing strand population should be, and subtract this er- 
ror signal from the measured signal to arrive at a true 
digital signal representative of the sequence of the lead- 
ing strand population, which is the desired result. 
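
The prediction and subtraction of the trailing strand signal can be sketched in a few lines. This is an illustrative model, not the disclosed implementation. It assumes that the trailing strands reproduce the leading strand extension pattern delayed by one complete probe cycle (four probes), that both population sizes stay constant over the window, and that the trailing contribution is zero for the first four probes. The amplitudes 0.1 (leading) and 0.15 (trailing, taken here as 60% of strands times the 25% surviving misincorporation) are assumptions drawn from the description of Figure 14B. All function names are hypothetical.

```python
from itertools import cycle, islice

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def bases_added_per_probe(template, probe_cycle, n_probes):
    """Number of bases the in-phase strands add on each successive probe."""
    to_add = "".join(COMPLEMENT[b] for b in template)
    pos, added = 0, []
    for probe in islice(cycle(probe_cycle), n_probes):
        start = pos
        while pos < len(to_add) and to_add[pos] == probe:
            pos += 1
        added.append(pos - start)
    return added


def measured_signal(added, lead_amp, trail_amp, lag=4):
    """Leading signal plus trailing signal, the latter modelled as the
    leading pattern delayed by `lag` probes."""
    return [lead_amp * n + (trail_amp * added[t - lag] if t >= lag else 0.0)
            for t, n in enumerate(added)]


def decode(measured, lead_amp, trail_amp, lag=4):
    """Recover bases added per probe by subtracting the predicted trailing
    signal, which depends only on probes that were already decoded."""
    decoded = []
    for t, signal in enumerate(measured):
        predicted = trail_amp * decoded[t - lag] if t >= lag else 0.0
        decoded.append(round((signal - predicted) / lead_amp))
    return decoded


if __name__ == "__main__":
    template = "CTGA" + "GAAACCAGAAAGTCC" + "T"
    added = bases_added_per_probe(template, "CAGT", 30)
    raw = measured_signal(added, lead_amp=0.1, trail_amp=0.15)
    print([round(s, 2) for s in raw])
    print(decode(raw, lead_amp=0.1, trail_amp=0.15))
```

In this noise-free model the decoded readout equals the leading strand pattern exactly; with real data the subtraction would leave residual noise, which is the subject of the discussion that follows.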
[0162] Given the ability to compute and subtract any trailing 

strand signals as discussed, the accuracy with which nu- 
cleotide incorporation or non-incorporation on the lead- 
ing strands can be sensed is limited, not by the absolute 
size of the trailing strand signal, but instead by the noise 
on those signals. For example, assume that the signal for 



a single-base extension of a trailing strand population 
equal to 20% of the leading strand population is 0.2 
±0.05. If the trailing strands encounter a 5-base repeat, 
the resulting signal would be identical in amplitude to that 
produced by a single-base extension of the leading 
strands, but this signal could be subtracted from the ob- 
served signal to yield either a signal resulting from a lead- 
ing strand extension, or a null signal corresponding to no 
extension of the leading strands. Assuming that the noise 
is purely statistical and therefore is reduced in proportion 
to the square root of the signal amplitude, for a 5-base 
extension of the trailing strands or a single extension of 
the leading strands the signal would be 1 ± (0.05 × √5),
i.e. 1 ±0.11, because the statistical noise on a set of 
added signals grows as the square root of the number of 
signals. One can subtract from this value a correction sig- 
nal which is much more accurately known because the 
trailing strand signal has been repeatedly measured yield- 
ing better statistics on this value. It is assumed that the 
uncertainty in the correction signal is negligible. For no 
extension of the leading strands, the resulting difference 
signal would be 0 ±0.11, whereas a single extension of 
the leading strands would yield a difference signal of 1 



±0.11; the two signals are distinguishable with better 
than 99.9% accuracy. 
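
The noise estimate in this paragraph can be verified with a short calculation. This is a minimal sketch assuming Gaussian noise and a decision threshold placed midway between the two difference signals (0 and 1); the midpoint rule and the function names are assumptions made for illustration.

```python
import math


def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def combined_sd(single_sd: float, n_signals: int) -> float:
    """Standard deviation of a sum of n independent signals of equal noise."""
    return single_sd * math.sqrt(n_signals)


def misclassification(separation: float, sd: float) -> float:
    """Chance that a signal crosses a threshold placed midway between two
    levels `separation` apart, each with standard deviation `sd`."""
    return upper_tail((separation / 2.0) / sd)


if __name__ == "__main__":
    sd = combined_sd(0.05, 5)
    print(round(sd, 2))
    print(f"{1.0 - misclassification(1.0, sd):.4%}")
```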

[0163] The example given here is an extreme case: in fact, the 
extension failure can be corrected at any point, so that it 
will be possible to minimize the trailing strand population 
below a level where it would produce signals that make 
the leading strand sequence uncertain. 

[0164] There are additional advantages to the computer-aided 
monitoring method proposed. First, the signals from the
trailing strands serve as an additional check on the leading
strand sequence. Second, the trailing strand population
could be allowed to surpass the leading strand population
in magnitude. Without computer-aided monitoring,
readout would have to cease well before this point;
however, with computer-aided monitoring, readout can
continue, now using the trailing strands rather than the
leading strands to reveal the sequence. Thus, the strand
population that trails due to only one extension failure now
becomes the leading strand population for the purposes 
of computer aided monitoring. This allows readout to 
continue until further complications arise from the occur- 
rence of 2 extension failures on the same strand, produc- 
ing a new trailing strand population which can be tracked 



in the same way as the single failure strands, while the 
population of strands that have undergone no extension failure
diminishes to the point where it contributes no detectable 
signal. 

[0165] Optimization of reagents, enzyme and reaction conditions 
should allow misincorporation probabilities below 1%, and 
extension failure probabilities as low as 0.1%. The com- 
puter aided monitoring method of the present invention 
additionally provides a means for healing the trailing 
strand population by selectively extending this population 
so that it is again synchronous with the leading strands. 
For example, given a dNTP probe cycle of GCTA, and a
template sequence (beyond the 3' end of the primer) of:

GTGCAGATCTG ... 



and assuming that when dCTP is in the reaction chamber, 
the polymerase fails to incorporate a C in some fraction of 
the primer strands, the following results: 



Template          GTG CAG ATC TG ...
Main strands      C

Template          GTG CAG ATC TG ...
Failure strands



At the end of the first cycle, the main strands have ex- 
tended by ....CA, while the failure strand has not ad- 
vanced. After one more complete cycle, the main strand 
extension is ....CAC and the failure strand now reads 
....CA, i.e. now just one base out of phase. 



Template          GTGCAGATCTG ...
Main strands      CAC

Template          GTGCAGATCTG ...
Failure strands   CA



Because the phase lag arises from the repeating interac- 
tion of the probe cycle sequence with the template se- 
quence, the unchanged probe cycle can never have the 
correct sequence to resynchronize the strands. Instead, if 
the probe cycle is unchanged, and if no further extension 
failures occur, the phase lag for a given failure strand
oscillates perpetually between 1 and 3 bases, counting
single-base repeats as one base for this purpose. However,
because the leading strand sequence up to the last
extension is always known, one can determine the effect of
introducing an extension failure at some upstream position.
It should be noted that an extension failure introduced at 
any arbitrary upstream position, or any base type, always 
produces the same phase lag because the effect of an ex- 
tension failure is to cause extension of the affected strand 
to lag by one complete dNTP cycle. Thus, it is possible to 
alter the probe cycle sequence, for example to probe with 
a C, instead of a G, after the last A in the sequence dis- 
cussed above. The failure strand would advance while the 
main strands did not and the phase lag would heal. In yet 
another embodiment the dNTP probe cycle may be re- 
versed whenever the phase lag shrinks to only 1 base. 
Whenever the phase difference declines to a single base, 
or repeats of a single base, then simply reversing the 
probe cycle sequence always resynchronizes the strands. 
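
Both healing strategies can be checked on the example template. The sketch below is illustrative only and rests on assumptions: the probe cycle is taken to be GCTA, the template is read in the direction of extension, and the starting positions (main strands at ....CAC, failure strands at ....CA) are the state reached after two complete cycles in the example above. The helper `probe` is hypothetical.

```python
# Watson-Crick complement of each template base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def probe(template, pos, dntp):
    """Offer one dNTP; extend through all consecutive matching bases."""
    while pos < len(template) and COMPLEMENT[template[pos]] == dntp:
        pos += 1
    return pos


template = "GTGCAGATCTG"  # assumed to be read in the direction of extension
main, failure = 3, 2      # ....CAC versus ....CA, one base out of phase

# Strategy 1: after the last A probe, offer dCTP instead of dGTP.
# Only the failure strands are waiting for a C, so only they advance.
alt_main = probe(template, main, "C")
alt_failure = probe(template, failure, "C")

# Strategy 2: run the (assumed) GCTA cycle in reverse order, A-T-C-G.
rev_main, rev_failure = main, failure
for dntp in reversed("GCTA"):
    rev_main = probe(template, rev_main, dntp)
    rev_failure = probe(template, rev_failure, dntp)

print("altered probe :", alt_main, alt_failure)
print("reversed cycle:", rev_main, rev_failure)
```

In both cases the two populations finish at the same template position, which is what resynchronization means here.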
[0166] Figure 15 shows how a leading strand population arising 
from incorrect extension of a fraction of primer strands 
due to nucleotide impurities can adversely affect the
signals from the main population. Using the same template
sequence as before: [CTGA] GAA ACC AGA AA GTC C [TC AGT]
and the same probe cycle: CAGT, the effect of a leading
strand population which is 20% of the main strand
population, and 2 bases ahead of the main strands at the
time the main strand sequence begins to be read, can be
simulated. The leading strands have already extended by
-C TTT. The first C probe extends the main primer
strands by one base complementary to the first G in the 
sequence giving a single base extension signal of 1. The 
first G extends the leading strands by -GG- complemen- 
tary to the -CC- repeat, giving a signal of 0.4. Greater 
ambiguity arises when the leading strands encounter the 
second AAA-repeat at the second T probe, increasing the 
main strand signal from the correct value for a single base 
extension to 1.6. In the absence of further information, 
this value will be ambiguous or may be interpreted as a 
2-base repeat. 
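
The signal values quoted above (1, 0.4 and 1.6) can be reproduced numerically. The following is a sketch under assumptions rather than the simulation used for Figure 15: only the unbracketed part of the template is modeled and is taken to be read in the direction of extension, each incorporated nucleotide is assumed to contribute one signal unit weighted by the population fraction, and the leading strands start four nucleotides in (having already added -C TTT). The helper `probe` is hypothetical.

```python
from fractions import Fraction

# Watson-Crick complement of each template base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def probe(template, pos, dntp):
    """Offer one dNTP; extend through all consecutive matching bases."""
    while pos < len(template) and COMPLEMENT[template[pos]] == dntp:
        pos += 1
    return pos


template = "GAAACCAGAAAGTCC"    # unbracketed region, spaces removed
cycle = "CAGT"
lead_fraction = Fraction(1, 5)  # leading strands are 20% of main strands

main, lead = 0, 4               # leading strands already extended by CTTT
signals = []
for _ in range(2):              # two complete probe cycles
    for dntp in cycle:
        new_main = probe(template, main, dntp)
        new_lead = probe(template, lead, dntp)
        # Total signal: main extension plus weighted leading extension.
        signal = (new_main - main) + lead_fraction * (new_lead - lead)
        signals.append((dntp, signal))
        main, lead = new_main, new_lead

for dntp, signal in signals:
    print(dntp, float(signal))
```

The last entry is the second T probe: the main strands add a single T while the leading strands copy the second AAA repeat, giving 1 + 3 x 0.2 = 1.6.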

[0167] Correction for these ambiguities comes from the fact that 
the correct sequence of the main strands is read following 
the leading strand read. In general, a large multiple repeat 
which can give an error signal when encountered by the 
leading strands will subsequently give a larger signal 
when encountered by the main strands, and superimposed
on this correct signal will be a leading strand signal
for which there are three possibilities: (i) zero signal: the
leading strands do not extend; (ii) small signal that does
not create ambiguity: the leading strands extend by a
single base or a repeat number small enough not to
simulate an additional base extension of the main strands;
(iii) large signal: the leading strands encounter a second
large repeat. By monitoring the main strand sequence, it is
possible at each extension to retroactively estimate the
effects of a leading strand population and subtract such
signals from the main strand signals to arrive at a correct 
sequence. In the case where the leading strands encounter 
a repeat large enough to create ambiguity in the se- 
quence, even if the leading strands subsequently en- 
counter a second or third large repeat when the main 
strands encounter the first repeat, the main strands will 
eventually traverse the same region to give sufficient in- 
formation to derive the correct sequence. In other words, 
the sequence information at any point is always
overdetermined: the signal for any given extension is always
read twice, by the leading strands and the main strands,
and so yields sufficient information to determine both the
correct sequence and the fractional population of the
leading strands, which are the two pieces of information
required.

[0168] Because the sequence of the leading strand population
produced by impure nucleotides cannot be known until it
is subsequently traversed by the main strands, one cannot 
know what dNTP probe cycle would act to extend the main 
strands while not extending the leading strands, as was 
the case for a trailing strand population produced by ex- 
tension failure. However, as with trailing strands, the gap 
between the leading and main strands oscillates perpetu- 
ally between one and three bases, and can be reconnected 
by reversing the dNTP probe sequence whenever the gap 
between the leading and main strands shrinks to a single 
base. Although it cannot be known when this single base 
gap occurs, the dNTP probe sequence can be reversed at 
regular intervals. Trials indicate that such a process ulti- 
mately reconnects approximately 2/3 of the leading 
strands. The statistics for this process are as follows. 

[0169] Statistically, because the gap between the main and
leading strands can be 1, 2 or 3 bases, there is a 1/3
probability that the leading strand population will have only
a 1-base phase difference at any time the cycle is reversed.
The 1-base phase difference will always be healed by a cycle
reversal. Another 1/3 of the time the leading strands are 2
bases ahead at the time the cycle is reversed. For the next
probing base the following possibilities exist:



Lead      Main
strand    strand    Outcome                          Probability

 0         0        No extension on either strand    3/4 x 3/4 = 9/16
+1         0        Phase lag increases              1/4 x 3/4 = 3/16
+1        +1        Both strands advance             1/4 x 1/4 = 1/16
 0        +1        Phase lag decreases              3/4 x 1/4 = 3/16

Phase lag stays at 2:   Number of chances = 10/16
Phase lag decreases:    Number of chances = 3/16
Phase lag increases:    Number of chances = 3/16

So the chance of making a 2-base gap worse is
(3/16)/(10/16 + 3/16) = 23%. Considering all three gap
sizes: 1-base gap heals (33% of population); 2-base gap
gets worse 23% of the time: only 1/3 of gaps are 2 base,
so about 8% total get worse; 3-base gap also gets worse
23% of the time, again about 8% overall effect. In sum, 33%
heal at a given reversal, about 15% lose at a given reversal
and the remaining roughly 50% are unchanged. Even assuming
the 15% are permanently lost (and a 2 base gap increased to
a 3 base gap can still rejoin), at each subsequent reversal
1/3 of the strands left unchanged by the previous reversal
(roughly 50%) are healed, as follows:



Reversal #    Fraction of gaps healed

1             33%
2             17%
3             9%
4             4.5%
5             2.5%
6             1%
Total         ~67%



Therefore, repeated reversal of the dNTP probe cycle can 
reduce by 2/3 the effects of out-of-phase signals due to 
incorrect extension by nucleotide impurities, or random 
extension failure, effectively increasing the read length 
when limited by either effect by a factor of 3. 
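
The healing series tabulated above is a geometric progression, which the following sketch sums exactly. It assumes, following the text, that 1/3 of the still-eligible gaps heal at each reversal and that about 1/2 of the eligible gaps are carried over unchanged to the next reversal; the variable names are illustrative.

```python
from fractions import Fraction

heal = Fraction(1, 3)       # eligible gaps healed at each reversal
unchanged = Fraction(1, 2)  # eligible gaps carried over (approximate)

eligible = Fraction(1)
healed_per_reversal = []
for _ in range(6):
    healed_per_reversal.append(eligible * heal)
    eligible *= unchanged

total_after_six = sum(healed_per_reversal)

# Limit of the series for an unlimited number of reversals.
limit = heal / (1 - unchanged)

# If out-of-phase signal is what limits read length, removing a
# fraction `limit` of it scales the read length by 1 / (1 - limit).
read_length_gain = 1 / (1 - limit)

for n, healed in enumerate(healed_per_reversal, start=1):
    print(f"reversal {n}: {float(healed):.1%}")
print(f"total after six reversals: {float(total_after_six):.1%}")
print(f"limit: {float(limit):.1%}, read length gain: {read_length_gain}")
```

Six reversals already recover nearly all of the limiting 2/3, which corresponds to the threefold gain in read length stated above.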
[0170] Although the invention has been described herein with 
reference to specific embodiments, many modifications 
and variations therein will readily occur to those skilled in 
the art. Accordingly, all such variations and modifications 
are included within the intended scope of the invention. 



